STATEMENT OF FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
[0004] Not Applicable 
BACKGROUND OF THE INVENTION 

1. TECHNICAL FIELD 

[0005] This invention relates in general to telecommunications and, more
particularly, to a method and apparatus for optical switching.

2. DESCRIPTION OF THE RELATED ART 

[0006] Data traffic over networks, particularly the Internet, has increased
dramatically in recent years and will continue to grow as the number of users
increases and new services requiring more bandwidth are introduced. The
increase in Internet traffic requires a network with high-capacity routers
capable of routing data packets of variable length. One option is the use of
optical networks.

[0007] The emergence of dense wavelength division multiplexing (DWDM)
technology has eased the bandwidth problem by increasing the capacity of
an optical fiber. However, the increased capacity creates a serious mismatch
with current electronic switching technologies, which are capable of switching
data rates up to a few gigabits per second, as opposed to the multiple terabit
per second capability of DWDM. While emerging ATM switches and IP routers can
be used to switch data using the individual channels within a fiber, typically at a
few hundred gigabits per second, this approach implies that tens or hundreds of
switch interfaces must be used to terminate a single DWDM fiber with a large
number of channels. This could lead to a significant loss of statistical
multiplexing efficiency when the parallel channels are used simply as a collection
of independent links, rather than as a shared resource.

[0008] Different approaches advocating the use of optical technology in place
of electronics in switching systems have been proposed; however, the limitations
of optical component technology have largely limited optical switching to facility
management/control applications. One approach, called optical burst-switched
networking, attempts to make the best use of optical and electronic switching
technologies. The electronics provide dynamic control of system resources by
assigning individual user data bursts to channels of a DWDM fiber, while optical
technology is used to switch the user data channels entirely in the optical
domain.

[0009] Previous optical networks designed to directly handle end-to-end user 
data channels have been disappointing. 

[0010] Therefore, a need has arisen for a method and apparatus for providing 
an optical burst-switched network. 
BRIEF SUMMARY OF THE INVENTION

[0011] In the present invention, an optical burst-switched router is provided 
using an optical switch for routing optical information from an incoming optical 
transmission medium to one of a plurality of outgoing optical transmission 
media, wherein each of the outgoing optical transmission media can transmit 
data over a plurality of channels. A group identifier is assigned to each channel. 
Scheduling circuits associated with respective outgoing optical transmission 
media include one or more associative processors storing information indicative 
of times available for scheduling a data burst on the associated outgoing optical 
transmission medium and circuitry for controlling the one or more associative 
processors to find an available time on one of a plurality of channels associated 
with a predetermined group identifier. 

[0012] The present invention provides flexible scheduling of bursts using 
specific groups of channels for testing and other purposes. 
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

[0013] For a more complete understanding of the present invention, and the 
advantages thereof, reference is now made to the following descriptions taken in 
conjunction with the accompanying drawings, in which: 

[0014] Figure 1a is a block diagram of an optical network;

[0015] Figure 1b is a block diagram of a core optical router;

[0016] Figure 2 illustrates a data flow of the scheduling process; 

[0017] Figure 3 illustrates a block diagram of a scheduler; 

[0018] Figures 4a and 4b illustrate timing diagrams of the arrival of a burst 
header packet relative to a data burst; 

[0019] Figure 5 illustrates a block diagram of a DCS module; 

[0020] Figure 6 illustrates a block diagram of the associative memory of Pm; 

[0021] Figure 7 illustrates a block diagram of the associative memory of Pg; 

[0022] Figure 8 illustrates a flow chart of a LAUC-VF scheduling method; 

[0023] Figure 9 illustrates a block diagram of a CCS module; 

[0024] Figure 10 illustrates a block diagram of the associative memory of Pt; 

[0025] Figure 11 illustrates a flow chart of a constrained earliest time method 
of scheduling the control channel; 

[0026] Figure 12 illustrates a block diagram of the path & channel selector; 

[0027] Figure 13 illustrates an example of a blocked output channel through
the recirculation buffer;

[0028] Figure 14 illustrates a memory configuration for a memory of the BHP
transmission module;

[0029] Figure 15 illustrates a block diagram of an optical router architecture 
using passive FDL loops; 

[0030] Figure 16 illustrates an example of a path & channel scheduler with
multiple Pm and Pc pairs;

[0031] Figures 17a and 17b illustrate timing diagrams of outbound data
channels;

[0032] Figure 18 illustrates clock signals for CLKf and CLKs;

[0033] Figures 19a and 19b illustrate alternative hardware modifications for
slotted operation of a router;

[0034] Figure 20 illustrates a block diagram of an associative processor Pm; 

[0035] Figure 21 illustrates a block diagram of an associative processor Pg; 

[0036] Figure 22 illustrates a block diagram of an associative processor Pmg; 

[0037] Figure 23 illustrates a block diagram of an associative processor P mg;

[0038] Figure 24 illustrates a block diagram of an embodiment using multiple 
associative processors for fast scheduling; 

[0039] Figure 25 illustrates a block diagram of a processor Pu-ext for use with 
multiple channel groups; and 

[0040] Figure 26 illustrates a block diagram of a processor Pc-ext for use with
multiple channel groups. 
DETAILED DESCRIPTION OF THE INVENTION

[0041] The present invention is best understood in relation to Figures 1 - 26 of 
the drawings, like numerals being used for like elements of the various 
drawings. 

[0042] Figure 1a illustrates a general block diagram of an optical burst
switched network 4. The optical burst switched (OBS) network 4 includes
multiple electronic ingress edge routers 6 and multiple egress edge routers 8.
The ingress edge routers 6 and egress edge routers 8 are coupled to multiple core
optical routers 10. The connections between ingress edge routers 6, egress edge
routers 8 and core routers 10 are made using optical links 12. Each optical fiber
can carry multiple channels of optical data.

[0043] In operation, a data burst (or simply "burst") of optical data is the
basic data block to be transferred through the network 4. Ingress edge routers 6
and egress edge routers 8 are responsible for burst assembly and disassembly
functions, and serve as legacy interfaces between the optical burst switched
network 4 and conventional electronic routers.

[0044] Within the optical burst switched network 4, the basic data block to be
transferred is a burst, which is a collection of packets having some common
attributes. A burst consists of a burst payload (called "data burst") and a burst
header (called "burst header packet" or BHP). An intrinsic feature of the optical
burst switched network is that a data burst and its BHP are transmitted on
different channels and switched in optical and electronic domains, respectively,
at each network node. The BHP is sent ahead of its associated data burst with an
offset time $\tau$ (> 0). Its initial value, $\tau_0$, is set by the (electronic)
ingress edge router 6.
[0045] In this invention, a "channel" is defined as a certain unidirectional
transmission capacity (in bits per second) between two adjacent routers. A
channel may consist of one wavelength or a portion of a wavelength (e.g., when
time-division multiplexing is used). Channels carrying data bursts are called
"data channels", and channels carrying BHPs and other control packets are
called "control channels". A "channel group" is a set of channels with a common
type and node adjacency. A link is defined as a total transmission capacity
between two routers, which usually consists of a "data channel group" (DCG)
and a "control channel group" (CCG) in each direction.

[0046] Figure 1b illustrates a block diagram of a core optical router 10. The
incoming DCG 14 is separated from the CCG 16 for each fiber 12 by
demultiplexer 18. Each DCG 14 is delayed by a fiber delay line (FDL) 19. The
delayed DCG is separated into channels 20 by demultiplexer 22. Each channel 20
is input to a respective input node on a non-blocking spatial switch 24.
Additional input and output nodes of spatial switch 24 are coupled to a
recirculation buffer (RB) 26. Recirculation buffer 26 is controlled by a
recirculation switch controller 28. Spatial switch 24 is controlled by a spatial
switch controller 30.

[0047] CCGs 16 are coupled to a switch control unit (SCU) 32. The SCU 32
includes an optical/electronic transceiver 34 for each CCG 16. The
optical/electronic transceiver 34 receives the optical CCG control information
and converts the optical information into electronic signals. The electronic CCG
information is received by a packet processor 36, which passes information to a
forwarder 38. The forwarder for each CCG is coupled to a switch 40. The output
nodes of switch 40 are coupled to respective schedulers 42. Schedulers 42 are
coupled to a Path & Channel Selector 44 and to respective BHP transmit modules
46. The BHP transmit modules 46 are coupled to electronic/optical transceivers
48. The electronic/optical transceivers 48 produce the output CCG 52 to be
combined with the respective output DCG 54 information by multiplexer 50. Path &
channel selector 44 is also coupled to RB switch controller 28 and spatial
switch controller 30.

[0048] The embodiment shown in Figure 1b has N input DCG-CCG pairs and
N output DCG-CCG pairs 52, where each DCG has K channels and each CCG has
only one channel (k=1). A DCG-CCG pair 52 is carried in one fiber. In general,
the optical router could be asymmetric, the number of channels k of a CCG 16
could be larger than one, and a DCG-CCG pair 52 could be carried in more than
one fiber 12. In the illustrated embodiment, there is one buffer channel group
(BCG) 56 with R buffer channels. In general, there could be more than one BCG
56. The optical switching matrix (OSM) consists of a (NK+R)x(NK+R) spatial
switch and a RxR switch with a WDM (wavelength division multiplexing) FDL
buffer serving as recirculation buffer (RB) 26 to resolve data burst contentions on
outgoing data channels. The spatial switch is a strictly non-blocking switch,
meaning that an arriving data burst on an incoming data channel can be
switched to any idle outgoing data channel. The delay $\Delta$ introduced by the
input FDL 19 should be sufficiently long that the SCU 32 has enough time to
process a BHP before its associated data burst enters the spatial switch.

[0049] The RxR RB switch is a broadcast-and-select type switch of the type
described in P. Gambini, et al., "Transparent Optical Packet Switching Network
Architecture and Demonstrators in the KEOPS Project", IEEE J. Selected Areas in
Communications, vol. 16, no. 7, pp. 1245-1259, Sept. 1998. It is assumed that the
RxR RB switch has B FDLs, with the ith FDL introducing delay time $Q_i$,
$1 \le i \le B$. It is further assumed without loss of generality that
$Q_1 < Q_2 < \dots < Q_B$ and $Q_0 = 0$, meaning no FDL buffer is used. Note that
the FDL buffer is shared by all N input DCGs and each FDL contains R channels. A
data burst entering the RB switch on any incoming channel can be delayed by one
of the B delay times provided. The recirculation buffer in Figure 1b can be
degenerated to passive FDL loops by removing the function of the RB switch, as
shown in Figure 15, wherein different buffer channels may have different delays.

[0050] The SCU is partially based on an electronic router. In Figure 1b, the
SCU has N input control channels and N output control channels. The SCU
mainly consists of packet processors (PPs) 36, forwarders 38, a switching fabric
40, schedulers 42, BHP transmission modules 46, a path & channel selector 44, a
spatial switch controller 30, and an RB switch controller 28. The packet
processors 36, the forwarders 38, and the switching fabric 40 can be found in
electronic routers. The other components, especially the scheduler, are new to
optical routers. The design of the SCU uses distributed control as much as
possible, except for control of access to the shared FDL buffer, which is
centralized.

[0051] The packet processor performs layer 1 and layer 2 decapsulation
functions and attaches a time-stamp to each arriving BHP, which records the
arrival time of the associated data burst to the OSM. The time-stamp is the sum
of the BHP arrival time, the burst offset-time $\tau$ carried by the BHP and the
delay $\Delta$ introduced by input FDL 19. The forwarder mainly performs the
forwarding table lookup to decide the outgoing CCG 52 to which the BHP is
forwarded. The associated data burst will be switched to the corresponding DCG
54. The forwarding can be done in a connectionless or connection-oriented manner.

[0052] There is one scheduler for each DCG-CCG pair 52. The scheduler 42
schedules the switching of the data burst on a data channel of the outgoing DCG
54 based on the information carried by the BHP. If a free data channel is found,
the scheduler 42 will then schedule the transmission of the BHP on the outgoing
control channel, trying to "resynchronize" the BHP and its associated data burst
by keeping the offset time $\tau$ (> 0) as close as possible to $\tau_0$. After
both the data burst and BHP are successfully scheduled, the scheduler 42 will
send the configuration information to the spatial switch controller 30 if it is
not necessary to provide a delay through the recirculation buffer 26; otherwise
it will also send the configuration information to the RB switch controller 28.

[0053] The data flow of the scheduling decision process is shown in Figure 2. In
decision block 60, the scheduler 42 determines whether or not there is enough
time to schedule an incoming data burst. If so, the scheduler determines whether
the data burst can be scheduled, i.e., whether there is an unoccupied space in the
specified output DCG 54 for the data burst. In order to schedule the data burst,
there must be an available space to accommodate the data burst in the specified
output DCG. This space may start within a time window beginning at the point
of arrival of the data burst at the spatial switch 24 and extending to the maximum
delay which can be provided by the recirculation buffer 26. If the data burst can
be scheduled, then the scheduler 42 must determine whether there is a space
available in the output CCG 52 for the BHP in decision block 64.

[0054] If any of the decisions in decision blocks 60, 62 or 64 is negative, the
data burst and BHP are dropped in block 65. If all of the decisions in decision
blocks 60, 62 and 64 are positive, the scheduler sends the scheduling information
to the path & channel selector 44. The configuration information from the
scheduler to the path & channel selector includes incoming DCG identifier,
incoming data channel identifier, outgoing DCG identifier, outgoing data channel
identifier, data burst arrival time to the spatial switch, data burst duration,
and FDL identifier i ($Q_i$ delay time is requested, $0 \le i \le B$).

[0055] If the FDL identifier is 0, meaning no FDL buffer is required, the path
& channel selector 44 will simply forward the configuration information to the
spatial switch controller 30. Otherwise, the path & channel selector 44 searches
for an idle incoming buffer channel to the RB switch 26 in decision block 68. If
found, the path & channel selector 44 searches for an idle outgoing buffer
channel from the RB switch 26 to carry the data burst reentering the spatial
switch after the specified delay inside the RB switch 26 in decision block 70. It is
assumed that once the data burst enters the RB switch, it can be delayed for any
discrete time from the set $\{Q_1, Q_2, \dots, Q_B\}$. If this is not the case,
the path & channel selector 44 will have to take the RB switch architecture into
account. If both idle channels to and from the RB switch 26 are found, the path &
channel selector 44 will send configuration information to the spatial switch
controller 30 and the RB switch controller 28 and send an ACK (acknowledgement)
back to the scheduler 42. Otherwise, it will send a NACK (negative
acknowledgement) back to the scheduler 42 and the BHP and data burst will be
discarded in block 65.
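The decision flow of Figure 2, as described in paragraphs [0053] to [0055],
can be sketched as follows. This is an illustration only: the `Checks` record,
the `decide` function and the returned strings are hypothetical names that do
not appear in the specification, and each boolean stands in for the outcome of
a search that the scheduler 42 or the path & channel selector 44 would perform.

```python
from dataclasses import dataclass


@dataclass
class Checks:
    """Outcomes of the individual decisions (hypothetical container)."""
    enough_time: bool          # decision block 60
    data_space_found: bool     # decision block 62
    control_space_found: bool  # decision block 64
    fdl_id: int                # 0 means no FDL buffer is required
    idle_buffer_in: bool       # decision block 68
    idle_buffer_out: bool      # decision block 70


def decide(c: Checks) -> str:
    """Return the action taken for one BHP and its data burst."""
    # Any negative scheduler decision drops the burst and its BHP (block 65).
    if not (c.enough_time and c.data_space_found and c.control_space_found):
        return "DROP"
    # FDL identifier 0: only the spatial switch controller is configured.
    if c.fdl_id == 0:
        return "CONFIGURE_SPATIAL_SWITCH"
    # Otherwise both an incoming and an outgoing buffer channel must be idle.
    if c.idle_buffer_in and c.idle_buffer_out:
        return "CONFIGURE_SPATIAL_AND_RB_SWITCH"
    # NACK to the scheduler; BHP and data burst are discarded (block 65).
    return "NACK_DROP"
```

For instance, a burst that needs FDL 2 but finds no idle outgoing buffer
channel yields `"NACK_DROP"` in this sketch.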

[0056] Configuration information from the path & channel selector 44 to the
spatial switch controller 30 includes incoming DCG identifier, incoming data
channel identifier, outgoing DCG identifier, outgoing data channel identifier,
data burst arrival time to the spatial switch, data burst duration, and FDL
identifier i ($Q_i$ delay time is requested, $0 \le i \le B$). If i > 0, the
information also includes the incoming BCG identifier (to the RB switch),
incoming buffer channel identifier (to the RB switch), outgoing BCG identifier
(from the RB switch), and outgoing buffer channel identifier (from the RB
switch).

[0057] Configuration information from the path & channel selector to the RB
switch controller includes an incoming BCG identifier (to the RB switch),
incoming buffer channel identifier (to the RB switch), outgoing BCG identifier
(from the RB switch), outgoing buffer channel identifier (from the RB switch),
data burst arrival time to the RB switch, data burst duration, and FDL
identifier i ($Q_i$ delay time is requested, $1 \le i \le B$).

[0058] The spatial switch controller 30 and the RB switch controller 28 will
perform the mapping from the configuration information received to the physical
components that are involved in setting up the internal path(s), and configure
the switches just-in-time to let the data burst fly through the optical router
10. When the FDL identifier is larger than 0, the spatial switch controller will
set up two internal paths in the spatial switch: one from the incoming data
channel to the incoming recirculation buffer channel when the data burst arrives
at the spatial switch, and another from the outgoing buffer channel to the
outgoing data channel when the data burst reenters the spatial switch. Upon
receiving the ACK from the path & channel selector 44, the scheduler 42 will
update the state information of the selected data and control channels, and is
then ready to process a new BHP.

[0059] Finally, the BHP transmission module arranges the transmission of
BHPs at times specified by the scheduler.

[0060] The above is the general description of how the data burst is
scheduled in the optical router. Recirculating data bursts through the RxR
recirculation buffer switch more than once could be easily extended from the
design principles described below if so desired.

[0061] Figure 3 illustrates a block diagram of a scheduler 42. The scheduler 42
includes a scheduling queue 80, a BHP processor 82, a data channel scheduling
(DCS) module 84, and a control channel scheduling (CCS) module 86. Each
scheduler needs only to keep track of the busy/idle periods of its associated
outgoing DCG 54 and outgoing CCG 52.

[0062] BHPs arriving from the electronic switch are first stored in the
scheduling queue 80. For basic operations, all that is required is one scheduling
queue 80; however, virtual scheduling queues 80 may be maintained for different
service classes. Each queue 80 could be served according to the arrival order of
BHPs or according to the actual arrival order of their associated data bursts. The
BHP processor 82 coordinates the data and control channel scheduling process
and sends the configuration to the path & channel selector 44. It could trigger the
DCS module 84 and the CCS module 86 in sequence or in parallel, depending on
how the DCS and CCS modules 84 and 86 are implemented.



[0063] In the case of serial scheduling, the BHP processor 82 first triggers the
DCS module 84 for scheduling the data burst (DB) on a data channel in a desired
output DCG 54. After determining when the data burst will be sent out, the BHP
processor then triggers the CCS module 86 for scheduling the BHP on an
associated control channel.



[0064] In the case of parallel scheduling, the BHP processor 82 triggers the
DCS module 84 and CCS module 86 simultaneously. Since the CCS module 86
does not know when the data burst will be sent out, it schedules the BHP for all
possible departure times of the data burst or a subset of them. There are in
total B+1 possible departure times. Based on the actual data burst departure
time reported from the DCS module 84, the BHP processor 82 will pick the right
time to send out the BHP.

[0065] Slotted transmission is used in data and control channels between edge
and core nodes and between core nodes in the OBS network. A slot is a
fixed-length time period. Let $T_s$ be the duration (e.g., in μs) of a time slot
in data channels and $T_f$ be the duration of a time slot in control channels.
$T_s \cdot r_s$ Kbits of information can be sent during a slot if the data
channel speed is $r_s$ gigabits per second. Similarly, $T_f \cdot r_f$ Kbits of
information can be sent during a slot if the control channel speed is $r_f$
gigabits per second. Two scenarios are considered: (1) $r_f = r_s$ and (2)
$r_f \neq r_s$. In the latter case, a typical example is that $r_f = r_s/4$
(e.g., OC-48 is used in control channels and OC-192 is used in data channels).



[0066] Without loss of generality, it is assumed that $T_f$ is equal to a
multiple of $T_s$. Two examples are depicted in Figures 4a and 4b (see also
Figure 18), which illustrate the timestamp and burst offset-time in a slotted
transmission system for the cases where $T_f = T_s$ and $T_f = 4T_s$, with the
initial offset time $\tau_0 = 8T_s$. To simplify the description, we use time
frame to designate a time slot in control channels. It is further assumed
without loss of generality that (1) data bursts are variable length, in
multiples of slots, and can only arrive at slot boundaries, and (2) BHPs are
also variable length, in, for instance, multiples of bytes. Fixed-length data
bursts and BHPs are just special cases. In slotted transmission, there is some
overhead in each slot for various purposes like synchronization and error
detection. Suppose the frame payload on control channels is $P_f$ bytes, which
is less than $(T_f \cdot r_f \cdot 1000)/8$ bytes, the total amount of
information that can be transmitted in a time frame.
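The slot capacity arithmetic above follows from the units: a duration in
microseconds multiplied by a rate in gigabits per second gives kilobits. A
minimal sketch of that arithmetic is given below. The function names are
hypothetical, and the 1 μs data slot and the nominal 10 and 2.5 gigabit per
second rates used in the usage note are illustrative assumptions, not values
taken from the specification.

```python
def slot_capacity_kbits(slot_us: float, rate_gbps: float) -> float:
    """Kbits that fit in one slot: microseconds times Gbit/s equals Kbits."""
    return slot_us * rate_gbps


def frame_capacity_bytes(frame_us: float, rate_gbps: float) -> float:
    """Upper bound, in bytes, on what one control-channel time frame carries.

    The usable frame payload is smaller because each slot also carries
    synchronization and error-detection overhead.
    """
    return frame_us * rate_gbps * 1000 / 8
```

Under these assumed values, a 1 μs data slot at 10 gigabits per second and a
4 μs control frame at 2.5 gigabits per second both carry 10 Kbits, which is
1250 bytes per frame before overhead.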

[0067] The OSM is configured periodically. For slotted transmission on data
channels, a typical example of the configuration period is one slot, although the
configuration period could also be a multiple of slots. Here it is assumed that the
OSM is configured every slot. The length of FDL $Q_i$ also needs to be a multiple
of slots, $1 \le i \le B$. Due to the slotted transmission and switching, it is
suggested to use the time slot as a basic time unit in the SCU for the purpose
of data channel scheduling, control channel scheduling and buffer channel
scheduling, as well as synchronization between BHPs and their associated data
bursts. This will simplify the design of various schedulers.

[0068] The following integer variables are used in connection with Figures 4a,
4b and 5:

    $t_{BHP}$ : the beginning of a time frame during which the BHP enters the
                SCU;
    $t_{DB}$  : the arrival time of a data burst (DB) to the optical switching
                matrix (OSM);
    $l_{DB}$  : the duration/length of a DB in slots;
    $\Delta$  : delay (in slots) introduced by input FDL;
    $\tau$    : burst offset-time (in slots).

[0069] Each BHP arriving at the SCU is time-stamped at the transceiver
interface, right after O/E conversion, recording the beginning of the time frame
during which the BHP enters the SCU. BHPs received by the SCU in the
same time frame will have the same timestamp $t_{BHP}$. For scheduling
purposes, the most important variable is the DB arrival time to the OSM.
Suppose a b-bit slot counter is used in the SCU to keep track of time; $t_{DB}$
can then be calculated as follows.

[0070] $t_{DB} = (t_{BHP} \cdot T_f + \Delta + \tau) \bmod 2^b$. (1)

[0071] Timestamp $t_{DB}$ will be carried by the BHP within the SCU 32. Note
that the burst offset-time $\tau$ is also counted starting from the beginning of
the time frame in which the BHP arrives, as shown in Figures 4a-b, where in
Figure 4a, $t_{BHP} = 9$ and $\tau = 6$ slots, and in Figure 4b, $t_{BHP} = 2$
and $\tau = 7$ slots. Suppose $\Delta = 100$ slots; we have $t_{DB} = 115$,
meaning that the DB will arrive at slot boundary 115. In Figures 4a-b,
$1 \le \tau \le \tau_0 = 8$. It is assumed without loss of generality that the
switching latency of the spatial switch in Figure 1b is negligible. So the data
burst arrival time $t_{DB}$ to the spatial switch 24 is also its departure time
if no FDL buffer is used. Note that even if the switching latency is not
negligible, $t_{DB}$ can still be used as the data burst departure time in
channel scheduling, as the switching latency is compensated at router output
ports where data and control channels are resynchronized.
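The two worked values above can be checked with a short sketch of equation
(1), read here as the frame timestamp scaled by the frame length in slots,
plus the input FDL delay, plus the offset time, modulo the counter range. The
function name `db_arrival_slot`, the parameter names, and the counter width of
8 bits are assumptions made for illustration; the frame length is expressed in
data-channel slots (1 for Figure 4a, 4 for Figure 4b).

```python
def db_arrival_slot(t_bhp: int, frame_slots: int, delta: int,
                    tau: int, b: int) -> int:
    """Data burst arrival slot at the OSM, per the reading of equation (1).

    t_bhp       -- time frame in which the BHP entered the SCU
    frame_slots -- control-channel frame length, in data-channel slots
    delta       -- input FDL delay, in slots
    tau         -- burst offset-time, in slots
    b           -- width of the slot counter, in bits (assumed value below)
    """
    return (t_bhp * frame_slots + delta + tau) % (1 << b)


# Figure 4a: frame equals one slot; Figure 4b: frame equals four slots.
fig_4a = db_arrival_slot(t_bhp=9, frame_slots=1, delta=100, tau=6, b=8)
fig_4b = db_arrival_slot(t_bhp=2, frame_slots=4, delta=100, tau=7, b=8)
```

Both calls give slot 115, matching the arrival slot stated in the text for a
100-slot input FDL delay.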

[0072] Figure 5 illustrates a block diagram of a DCS module 84. In this
embodiment, associative processor arrays Pm 90 and Pg 92 perform parallel
searches of unscheduled channel times and gaps between scheduled channel
times and update state information. Gaps and unscheduled times are
represented in relative times. Pm 90 and Pg 92 are coupled to control processor
CP1 94. In one embodiment, a LAUC-VF (Latest Available Unused Channel with
Void Filling) scheduling principle is used to determine a desired scheduling, as
described in connection with U.S. Ser. No. 09/689,584, entitled "Hardware
Implementation of Channel Scheduling Algorithms of Optical Routers with FDL
Buffers" to Zheng et al., filed October 12, 2000, and which is incorporated by
reference herein.

[0073] The DCS module 84 uses two b-bit slot counters, C and C1. Counter C
keeps track of the time slots, and can be shared with the CCS module 86.
Counter C1 records the elapsed time slots since the last BHP was received. Both
counters are incremented by every pulse of the slot clock. However, counter C1 is
reset to 0 when the DCS module 84 receives a new BHP. Once counter C1 reaches
$2^b-1$, it stops counting, indicating that at least $2^b-1$ slots have elapsed
since the last BHP. The value of b should satisfy $2^b > W_s$, where $W_s$ is
the data channel scheduling window. $W_s = \tau_0 + \Delta + Q_B + L_{max} - S$,
where $L_{max}$ is the maximum length of a DB and S is the minimum delay of a
BHP from O/E conversion to the scheduler 42. Assuming that $\tau_0 = 8$,
$\Delta = 120$, $Q_B = 32$, $L_{max} = 64$, and $S = 40$, then $W_s = 184$
slots. In this case, b = 8 bits.
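The sizing of the slot counters can be reproduced with the short sketch below.
It assumes the window is the sum of the initial offset time, the input FDL
delay, the longest FDL delay and the maximum burst length, less the minimum
BHP delay to the scheduler, and that the counter width is the smallest b whose
range exceeds the window. The function and parameter names are hypothetical.

```python
def scheduling_window(tau0: int, delta: int, q_max: int,
                      l_max: int, s_min: int) -> int:
    """Data channel scheduling window, in slots."""
    return tau0 + delta + q_max + l_max - s_min


def counter_bits(window: int) -> int:
    """Smallest b such that 2**b is strictly greater than the window.

    For a positive integer w, w.bit_length() is exactly that b.
    """
    return max(1, window.bit_length())


window = scheduling_window(tau0=8, delta=120, q_max=32, l_max=64, s_min=40)
bits = counter_bits(window)
```

With the example values from the text, `window` is 184 slots and `bits` is 8.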

[0074] Associative processor Pm in Figure 5 is used to store the unscheduled
time of each data channel in a DCG. Let $t_i$ be the unscheduled time of channel
$H_i$, which is stored in the ith entry of Pm, $0 \le i \le K-1$. Then from slot
$t_i$ onwards, channel $H_i$ is free, i.e., nothing is scheduled. $t_i$ is a
relative time, with respect to the time slot in which the latest BHP is received
by the scheduler. Pm has an associative memory of 2K words to store the
unscheduled times and channel identifiers, respectively. The unscheduled times
are stored in descending order. For example, in Figure 6 we have K = 8 and
$t_0 \ge t_1 \ge t_2 \ge t_3 \ge t_4 \ge t_5 \ge t_6 \ge t_7$.

[0075] Similarly, associative processor Pg in Figure 5 is used to store the gaps
of data channels in a DCG. We use $l_j$ and $r_j$ to denote the start time and
ending time of gap j, $0 \le j \le G-1$, which are also relative times. This gap
is stored in the jth entry of Pg and its corresponding data channel is $H_j$. Pg
has an associative memory of G words to store the gap start time, gap ending
time, and channel identifiers, respectively. Gaps are also stored in descending
order according to their start times $l_j$. For example, Figure 7 illustrates
the associative memory of Pg, where
$l_0 \ge l_1 \ge l_2 \ge \dots \ge l_{G-2} \ge l_{G-1}$. G is the total number
of gaps that can be stored. If there are more than G gaps, the newest gap with
larger start time will push out the gap with the smallest start time, which
resides at the bottom of the associative memory. Note that if $l_j = 0$, then
there are in total j gaps in the DCG, as
$l_{j+1} = l_{j+2} = \dots = l_{G-1} = 0$.

[0076] Upon receiving a request from the BHP processor to schedule a DB
with departure time $t_{DB}$ and duration $l_{DB}$, the control processor (CP1)
94 first records the time slot $t_{sch}$ during which it receives the request,
reads counter C1 ($t_e \leftarrow C_1$) and resets C1 to 0. Using $t_{sch}$ as a
new reference time, CP1 then calculates the DB departure time (no FDL buffer)
with respect to $t_{sch}$ as

    $t'_{DB} = (t_{DB} - t_{sch} + 2^b) \bmod 2^b$. (2)

In the meantime, CP1 updates Pm using

    $t_i = \max(0, t_i - t_e)$, $0 \le i \le K-1$ (3)

and updates Pg using the following formulas,

    $l_j = \max(0, l_j - t_e)$, $0 \le j \le G-1$ (4)

and

    $r_j = \max(0, r_j - t_e)$, $0 \le j \le G-1$. (5)
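The re-basing step above can be sketched in software as follows, under the
reading that the burst departure time is re-expressed relative to the slot in
which the request is received, and that every stored unscheduled time and gap
boundary is reduced by the elapsed slot count and clamped at zero. The names
`rebase`, `t_sch` and `t_e` are labels chosen for this illustration, and the
lists stand in for the associative memories of Pm and Pg, which the hardware
would update in parallel.

```python
from typing import List, Tuple


def rebase(t_db: int, t_sch: int, t_e: int,
           unscheduled: List[int],
           gaps: List[Tuple[int, int]],
           b: int) -> Tuple[int, List[int], List[Tuple[int, int]]]:
    """Apply the reference-time update of equations (2) to (5).

    t_db        -- absolute burst arrival slot carried by the BHP
    t_sch       -- slot in which the scheduling request is received
    t_e         -- slots elapsed since the previous BHP (read from C1)
    unscheduled -- relative unscheduled times, one per data channel
    gaps        -- relative (start, end) pairs, one per stored gap
    b           -- slot counter width in bits
    """
    mod = 1 << b
    # Equation (2): departure time relative to the new reference slot.
    t_db_rel = (t_db - t_sch + mod) % mod
    # Equation (3): shift unscheduled times, clamping at zero.
    new_unscheduled = [max(0, t - t_e) for t in unscheduled]
    # Equations (4) and (5): shift both gap boundaries, clamping at zero.
    new_gaps = [(max(0, l - t_e), max(0, r - t_e)) for l, r in gaps]
    return t_db_rel, new_unscheduled, new_gaps
```

A gap whose end has already passed collapses to `(0, 0)` and can no longer
match a burst of non-zero length.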

[0077] After the memory update, CP1 94 arranges the search of eligible
outgoing data channels to carry the data burst according to the LAUC-VF
method, cited above. The flowchart is given in Figure 8. In block 100, an index i
is set to "0". In block 102, Pg finds a gap in which to transmit the data burst
at time $t'_{DB}+Q_i$. In block 106, Pm finds an unscheduled channel on which to
transmit the data burst at $t'_{DB}+Q_i$. Note that the operations of finding a
gap in Pg to transmit the DB at time $t'_{DB}+Q_i$ and finding an unscheduled
time in Pm to transmit the DB at time $t'_{DB}+Q_i$ are preferably performed in
parallel. The operation of finding a gap in Pg to transmit the data burst at
time $t'_{DB}+Q_i$ (block 102) includes parallel comparison of each entry in Pg
with ($t'_{DB}+Q_i$, $t'_{DB}+Q_i+l_{DB}$). If $t'_{DB}+Q_i \ge l_j$ and
$t'_{DB}+Q_i+l_{DB} \le r_j$, the response bit of entry j returns 1, otherwise
it returns 0, $0 \le j \le G-1$. If at least one entry in Pg returns 1, the gap
with the smallest index is selected.

[0078] The operation of finding an unscheduled time in Pm to transmit the DB at
time $t'_{DB}+Q_i$ (block 106) includes parallel comparison of each entry in Pm
with $t'_{DB}+Q_i$. If $t'_{DB}+Q_i \ge t_j$, the response bit of entry j returns
1, otherwise it returns 0, $0 \le j \le K-1$. If at least one entry in Pm
returns 1, the entry with the smallest index is selected.

[0079] If the scheduling is successful in decision blocks 104 or 108, then
CP1 will inform the BHP processor 82 of the selected outgoing data channel and
the FDL identifier in blocks 105 or 109, respectively. After receiving an ACK from
the BHP processor 82, CP1 94 will update Pg 92 or Pm 90 or both. If
scheduling is not successful, i is incremented in block 110, and Pm and Pg try to
find a time to schedule the data burst at a different delay. Once $Q_i$ reaches
the maximum delay (decision block 112), the processors Pm and Pg report that the
data burst cannot be scheduled in block 114.

[0080] To speed up the scheduling process, the search can be performed in
parallel. For example, if B=2 and three identical Pm's and Pg's are used, as shown
in Figure 5, one parallel search will determine whether the data burst can be sent
out at times $t'_{DB}$, $t'_{DB} + Q_1$, and $t'_{DB} + Q_2$. If the data burst can be sent out
at more than one of these times, the smallest time is chosen. In another example,
if B=5 and three identical Pm's and Pg's are used, at most two parallel searches
will determine whether the DB can be scheduled.
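The two examples above follow from simple arithmetic: B delays plus the undelayed case give B + 1 candidate departure times, and each parallel search covers as many candidates as there are processor pairs. A minimal Python sketch (the function name is hypothetical) is:

```python
import math


def parallel_search_rounds(B: int, pairs: int) -> int:
    """Upper bound on search rounds needed to cover the B + 1 candidate times."""
    return math.ceil((B + 1) / pairs)
```

With B=2 and three pairs this gives one round, and with B=5 and three pairs it gives two rounds, matching the figures stated in the paragraph.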

[0081] Some simplified versions of the LAUC-VF method, listed below, could
also be used in the implementation. First, an FF-VF (first fit with void filling)
method could be used, wherein the unscheduled times in Pm and the gaps in Pg
are not sorted in any given order (either descending or ascending), and the
first eligible data channel found is used to carry the data burst. Second, a LAUC
(latest available unscheduled channel) method could be used, wherein Pg is not
used, i.e., no void filling is considered. This will further simplify the design.
Third, an FF (first fit) method could be used. FF is a simplified version of FF-VF
in which no void filling is used.

[0082] The block diagram of the CCS module 86 is shown in Figure 9. Similar
to the DCS module 84, associative processor Pt 120 keeps track of the usage of
the control channel. Since a maximum of $P_f$ bytes of payload can be transmitted
per frame, memory T 121 of Pt 120 tracks only the number of bytes available per
frame (Figure 10). Relative time is used here as well. The CCS module 86 has two
$b_1$-bit frame counters, $C_f$ and $C_f^e$. $C_f$ counts the time frames. $C_f^e$ records the
elapsed frames since the receiving of the last BHP. Upon receiving a BHP with
arrival time $t_{BHP}$, CP2 122 timestamps the frame during which this BHP is
received, i.e., $t_f \leftarrow C_f$. In the meantime, it reads counter $C_f^e$ ($t_e^f \leftarrow C_f^e$) and
resets $C_f^e$ to 0. It then updates Pt by shifting the $B_i$'s down by $t_e^f$ positions, i.e.,
$B_{i - t_e^f} = B_i$, $t_e^f \le i \le 2^{b_1} - 1$, and $B_i = P_f$ for $2^{b_1} - t_e^f \le i \le 2^{b_1} - 1$. In the
initialization, all the entries in Pt are set to $P_f$. Next, CP2 calculates the frame
during which the data burst will depart (assuming FDL $Q_i$ is used) using

$t^f_{DB}(Q_i) = \lfloor ((t_{DB} + Q_i) \bmod 2^{b}) / T_f \rfloor$, $0 \le i \le B$, (6)

where $T_f$ is the frame length in slots. The relative time frame in which the DB
will depart is calculated from

$t'^f_{DB}(Q_i) = (t^f_{DB}(Q_i) - t_f + 2^{b_1}) \bmod 2^{b_1}$, $0 \le i \le B$. (7)

[0083] The parameter $b_1$ can be estimated from parameter b, e.g., $2^{b_1} = 2^{b}/T_f$.
When b=8 and $T_f$ = 4, $b_1$=6. The following method is used to search for the
possible BHP departure time for a given DB departure time t (e.g., $t = t'^f_{DB}(Q_i)$).
The basic idea is to send the BHP as early as possible, but the offset time should
be no larger than $T_o$ (as described in connection with Figures 4a and 4b). Let
$J = \lfloor T_o / T_f \rfloor$. For example, when $T_o$ = 8 slots and $T_f$ = 1 slot, J = 8. When $T_o$ = 8
slots and $T_f$ = 4 slots, J = 2. Suppose the BHP length is X bytes.

[0084] In the preferred embodiment, a constrained earliest time (CET) method
is used for scheduling the control channel, as shown in Figure 11. In step 130, Pt
120 performs a parallel comparison of X (i.e., the length of a BHP) with the
contents $B_{t-j}$ of the relevant entries $E_{t-j}$ of memory T 121, $0 \le j \le J-1$ and
$t - j \ge 0$. If $X \le B_{t-j}$, entry $E_{t-j}$ returns 1; otherwise it returns 0. In step 132,
if at least one entry in Pt returns 1, the entry with the smallest index is chosen in
step 134. The index is stored and the CCS module 86 reports that a frame to send
the BHP has been found. If no entry in Pt returns 1, then a negative
acknowledgement is sent to the BHP processor 82 (step 136).

[0085] The actual frame in which the BHP will be sent out is
$(t^f_{DB}(Q_i) - j + 2^{b_1}) \bmod 2^{b_1}$ if $E_{t-j}$ is chosen. The new burst offset-time is

$(t_{DB} \bmod T_f) + j \cdot T_f$.
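As a non-authoritative sketch of the CET search and the two formulas of paragraph [0085], the following Python code uses a list `avail` for the per-frame byte budget held in memory T. The function names, the argument names, and the reading of the garbled offset formula as (t_DB mod T_f) + j·T_f are assumptions of this illustration.

```python
from typing import Optional


def cet_search(avail: list[int], t: int, J: int, X: int) -> Optional[int]:
    """Return the lead j (0 <= j <= J-1, t - j >= 0) of the chosen entry E_{t-j}.

    Choosing the smallest entry index t - j means choosing the earliest
    frame, i.e. the largest admissible j with X <= avail[t - j].
    """
    for j in range(min(J - 1, t), -1, -1):
        if X <= avail[t - j]:
            return j
    return None  # corresponds to the negative acknowledgement of step 136


def bhp_frame_and_offset(t_f_db: int, t_db: int, j: int, b1: int, T_f: int) -> tuple[int, int]:
    """Absolute BHP frame and new burst offset once entry E_{t-j} is chosen."""
    frame = (t_f_db - j + 2**b1) % 2**b1
    offset = (t_db % T_f) + j * T_f
    return frame, offset
```

A caller would first run `cet_search` on the relative frame index and, on success, pass the returned j to `bhp_frame_and_offset`.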

[0086] After running the CET method, the CCS module 86 sends the BHP
processor 82 the information on whether the BHP can be scheduled and in which
time frame it will be sent. Once it gets an ACK from the BHP processor 82, the
CCS module 86 will update Pt. For example, if the content in entry y needs to be
updated, then $B_y \leftarrow B_y - X$. If the BHP cannot be scheduled, the CCS module 86
will send a NACK to the BHP processor 82. In a real implementation, the
contents of Pt do not have to move physically. A pointer can be used to record
the entry index associated with the reference time frame 0.
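The pointer technique mentioned above can be sketched as a circular buffer whose base index marks relative frame 0, so that advancing the reference frame moves the pointer instead of the stored values. The class and method names below are hypothetical, and the sketch assumes the expired frames are refilled with the full frame payload, as in the shift-based update.

```python
class FrameBudget:
    """Bytes still available per frame, addressed by relative frame number."""

    def __init__(self, b1: int, payload: int) -> None:
        self.size = 2**b1
        self.payload = payload
        self.slots = [payload] * self.size
        self.base = 0  # physical index of relative frame 0

    def advance(self, elapsed: int) -> None:
        """Move the reference frame forward by `elapsed` frames.

        The expired entries are reset to the full payload; they reappear
        as the farthest future frames once the pointer has moved.
        """
        for k in range(min(elapsed, self.size)):
            self.slots[(self.base + k) % self.size] = self.payload
        self.base = (self.base + elapsed) % self.size

    def available(self, rel: int) -> int:
        return self.slots[(self.base + rel) % self.size]

    def reserve(self, rel: int, x: int) -> None:
        """Deduct a BHP of x bytes from the chosen relative frame."""
        self.slots[(self.base + rel) % self.size] -= x
```

No entry is copied in `advance`; only the expired entries are rewritten, which is the saving the paragraph describes.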

[0087] For parallel scheduling, as discussed below, since the CCS module 86
does not know the actual departure time of the data burst, it schedules the BHP
for all possible departure times of the data burst, or a subset of them, and reports
the results to the BHP processor 82. When B=2, there are three possible data burst
departure times, $t'_{DB}$, $t'_{DB} + Q_1$, and $t'_{DB} + Q_2$. As with the DCS module 84, if
three identical Pt's are used, as shown in Figure 9, one parallel search will
determine whether the BHP can be scheduled for the three possible data burst
departure times.

[0088] A block diagram of the path & channel selector 44 is shown in Figure
12. The function of the path & channel selector 44 is to control the access to the
RxR RB switch 26 and to instruct the RB switch controller 28 and the spatial
switch controller 30 to configure the respective switches 26 and 24. The path &
channel selector 44 includes processor 140 coupled to a recirculation-buffer-in
scheduling (RBIS) module 142, a recirculation-buffer-out scheduling (RBOS)
module 144 and a queue 146. The RBIS module 142 keeps track of the usage of
the R incoming channels to the RB switch 26 while the RBOS module 144 keeps
track of the usage of the R outgoing channels from the RB switch 26. Any
scheduling method can be used in RBIS and RBOS modules 142 and 144, e.g.,
LAUC-VF, FF-VF, LAUC, FF, etc. Note that RBIS module 142 and RBOS module
144 may use the same or different scheduling methods. From a manufacturing
viewpoint, it is better that the RBIS and RBOS modules use the same scheduling
method as the DCS module 84. Without loss of generality, it is assumed here that
the LAUC-VF method is used in both RBIS and RBOS modules 142 and 144; thus,
the design of the DCS module can be reused for these modules.

[0089] Assume a data burst with duration $l_{DB}$ arrives at the OSM at time
$t_{DB}$ and requires a delay time of $Q_i$. The processor 140 triggers the RBIS module
142 and RBOS module 144 simultaneously. It sends the information of $t_{DB}$ and
$l_{DB}$ to the RBIS module 142, and the information of time-to-leave the OSM
($t_{DB} + Q_i$) and $l_{DB}$ to the RBOS module 144. The RBIS module 142 searches for
incoming channels to the RB switch 26 which are idle for the time period of
($t_{DB}$, $t_{DB} + l_{DB}$). If there are two or more eligible incoming channels, the RBIS
module will choose one according to LAUC-VF. Similarly, the RBOS module 144
searches for outgoing channels from the RB switch 26 which are idle for the time
period of ($t_{DB} + Q_i$, $t_{DB} + l_{DB} + Q_i$). If there are two or more eligible outgoing
channels, the RBOS module 144 will choose one according to LAUC-VF. The
RBIS (RBOS) module sends either the selected incoming (outgoing) channel
identifier or a NACK to the processor. If an eligible incoming channel to the RB
switch 26 and an eligible outgoing channel from the RB switch 26 are found, the
processor will send back an ACK to both the RBIS and RBOS modules, which will
then update the channel state information. In the meantime, it will send an ACK
to the scheduler 42 and the configuration information to the two switch
controllers 28 and 30. Otherwise, the processor 140 will send a NACK to the RBIS
and RBOS modules 142 and 144 and a NACK to the scheduler 42.

[0090] The RBOS module 144 is needed because the FDL buffer to be used by 
a data burst is chosen by the scheduler 42, not determined by the RB switch 26. It 
is therefore quite possible that a data burst can enter the RB switch 26 but cannot 
get out of the RB switch 26 due to outgoing channel contention. An example is 
shown in Figure 13, where three fixed-length data bursts 148a-c arrive to the 2x2 
RB switch 26. The first two data bursts 148a-b will be delayed 2D time while the 
third DB will be delayed D time. Obviously, these three data bursts will leave the 
switch at the same time and contend for the two outgoing channels. The third 
data burst 148c is lost in this example. 

[0091] The BHP transmission module 46 is responsible for transmitting the
BHP on outgoing control channel 52 in the time frame determined by the BHP
processor 82. Since the frame payload is fixed, equal to $P_f$, in slotted transmission,
one possible implementation is illustrated in Figure 14, where the whole memory
is divided into $W_c$ segments 150 and BHPs to be transmitted in the same time
frame are stored in one segment 150. $W_c$ is the control channel scheduling
window, which equals $2^{b_1}$. There is a memory pointer per segment (shown in
segment $W_0$), pointing to the memory address where a new BHP can be stored. To
distinguish BHPs within a frame, the frame overhead should contain a field
indicating the number of BHPs in the frame. Furthermore, each BHP should
contain a length field indicating the packet length (e.g., in bytes), from the first
byte to the last byte of the BHP.

[0092] Suppose $t_c$ is the current time frame during which the BHP is received
by the BHP transmission module and $p_c$ points to the current memory segment.
Given the BHP departure time frame $t_f$, the memory segment to store this BHP
is calculated from $(p_c + (t_f - t_c + 2^{b_1}) \bmod 2^{b_1}) \bmod 2^{b_1}$.
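The segment calculation can be illustrated by a short Python function. The names `p_c`, `t_c` and `t_f` (current segment pointer, current frame, and BHP departure frame) are labels assumed for this sketch, since the original symbols are not legible in the source text.

```python
def bhp_segment(p_c: int, t_c: int, t_f: int, b1: int) -> int:
    """Memory segment for a BHP departing in frame t_f, seen from frame t_c."""
    window = 2**b1  # control channel scheduling window, in frames
    return (p_c + (t_f - t_c + window) % window) % window
```

Both modular reductions matter: the inner one handles wrap-around of the frame counter, and the outer one wraps the segment index within the window.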

[0093] Figure 15 shows the optical router architecture using passive FDL
loops 160 as the recirculation buffer, where the number of recirculation channels
$R = R_1 + R_2 + \dots + R_B$, with the jth channel group introducing $Q_j$ delay time,
$1 \le j \le B$. Here the recirculation channels are differentiated, while in Figure 1b
all the recirculation channels are equivalent, each able to provide B different
delays. The potential problem of using the passive FDL loops is the higher
blocking probability of accessing the shared FDL buffer. For example, suppose
B = 2, R = 4, $R_1$ = 2 and $R_2$ = 2, and currently the two recirculation channels of
$R_1$ are in use. If a new DB needs to be delayed by $Q_1$ time, it may be successfully
scheduled in Figure 1b, as there are still two idle recirculation channels. However,
it cannot be scheduled in Figure 15, since the two channels able to delay $Q_1$ are busy.
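The blocking example above can be restated as a small Python sketch. The two function names and the dictionary representation of channel groups are hypothetical and serve only to contrast the two buffer organizations.

```python
def can_buffer_shared(busy: int, R: int) -> bool:
    """Figure 1b style: any idle recirculation channel can provide any delay."""
    return busy < R


def can_buffer_passive(busy_per_group: dict[int, int],
                       group_size: dict[int, int], j: int) -> bool:
    """Figure 15 style: only the channels of group j provide delay Q_j."""
    return busy_per_group[j] < group_size[j]
```

With two of four channels busy, the shared buffer still accepts the burst, while the passive arrangement rejects it when both busy channels belong to the group providing the requested delay.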

[0094] The design of the SCU 32 is almost the same as described previously,
except for the following changes: (1) the RBOS module 144 within the path &
channel selector 44 (see Figure 12) is no longer needed, and (2) a slight
modification is required in the RBIS module 142 to distinguish recirculation
channels if B > 1. To reduce the blocking probability of accessing the FDL buffer
when B > 1, the scheduler is required to provide more than one delay option for
each data burst that needs to be buffered. The impact on the design of the
scheduler and the path & channel selector 44 is addressed below. Without loss of
generality, it is assumed in the following discussion that the scheduler 42 has to
schedule the data burst and the BHP for B+1 possible delays.

[0095] The design of DCS module 84 shown in Figure 5 remains valid in this
implementation. The search results could be stored in the format shown in Table
1 (assuming B=2), where the indicator (1/0) indicates whether or not an eligible
data channel is found for a given delay, say $Q_1$. The memory type (0/1) indicates
Pm or Pg. The entry index gives the location in the memory, which will be used
for information update later on. The channel identifier column gives the
identifiers of the channels found. The DCS module then passes the indicator
column and the channel identifier column (only those with indicator 1) to the
BHP processor.



Table 1: Stored search results in DCS module (B=2).

Delay   Indicator   Memory type   Entry index                   Channel identifier
        (1 bit)     (1 bit)       (max(log2 G, log2 K) bits)    (log2 K bits)
Q0
Q1
Q2

[0096] The design of CCS module 86 shown in Figure 9 also remains valid.
The search results could be stored in the format shown in Table 2 (assuming
B=2), where the indicator (1/0) indicates whether or not the BHP can be
scheduled on the control channel for a given DB departure time. The entry index
gives the location in the memory, which will be used for information update later
on. The "frame to send BHP" column gives the time frame in which the BHP is
scheduled to be sent out. The CCS module then passes the indicator column and
the "frame to send BHP" column (only those with indicator 1) to the BHP
processor.



Table 2: Stored search results in CCS module (B=2).

Delay   Indicator   Entry index   Frame to send BHP
        (1 bit)     (b1 bits)     (b1 bits)
Q0
Q1
Q2

[0097] After comparing the indicator columns from the DCS and CCS
modules, the BHP processor 82 in Figure 3 knows whether the data burst and its
BHP can be scheduled for a given FDL delay $Q_i$, $1 \le i \le B$, and determines which
configuration information will be sent to the path & channel selector 44 in Figure
12. The three possible scenarios are: (1) the data burst can be scheduled without
using an FDL buffer, (2) the data burst can be scheduled using an FDL buffer, and
(3) the data burst cannot be scheduled.

[0098] In the third case, the data burst and its BHP are simply discarded. In
the first case, the following information will be sent to the path & channel
selector: incoming DCG identifier, incoming data channel identifier, outgoing
DCG identifier, outgoing data channel identifier, data burst arrival time to the
spatial switch, data burst duration, and FDL identifier 0 (i.e., $Q_0$). The path &
channel selector 44 will immediately send back an ACK after receiving the
information. In the second case, the following information will be sent to the
path & channel selector:

■ incoming DCG identifier,
■ incoming data channel identifier,
■ number of candidate FDL buffers x,
■ for i = 1 to x:
    • outgoing DCG identifier,
    • outgoing data channel identifier,
    • FDL identifier i,
■ data burst arrival time to the spatial switch,
■ data burst duration.

[0099] In the second scenario, the path & channel selector 44 will search for an
idle buffer channel to carry the data burst. The RBIS module 142 is similar to the
one described in connection with Figure 12, except that now it has a Pm and Pg
pair for each group of channels with delay $Q_i$, $1 \le i \le B$. An example is shown in
Figure 16 for B=2. With one parallel search, the RBIS module will
know whether the data burst can be scheduled. When x=1, the RBIS module 142
performs a parallel search on (Pm1 90a, Pg1 92a) or (Pm2 90b, Pg2 92b), depending on
which FDL buffer is selected by the BHP processor 82. If an idle buffer channel is
found, it will inform the processor 140, which in turn sends an ACK to the BHP
processor 82. When x=2, both (Pm1, Pg1) and (Pm2, Pg2) will be searched. If two
idle channels with different delays are found, the channel with delay $Q_1$ is
chosen. In this case, an ACK together with the information that $Q_1$ is chosen will
be sent to the BHP processor 82. After a successful search, the RBIS module 142
will update the corresponding Pm and Pg pair.

[00100] Figures 17 - 26 illustrate variations of the LAUC-VF method, cited
above. In the LAUC-VF method cited above, two associative processors Pm and
Pg are used to store the status of all channels of the same outbound link.
Specifically, Pm stores r words, one for each of the r data channels of an
outbound link. It is used to record the unscheduled times of these channels. Pg
contains n superwords, one for each available time interval (a gap) of some data
channel. The times stored in Pm and Pg are relative times. Pm and Pg support
associative search operations, and data movement operations for maintaining the
times in sorted order. Because they process entries in parallel, Pm and Pg are
used as major components to meet stringent real-time channel scheduling
requirements.

[00101] In the embodiment described in Figures 22-23, a pair of associative
processors Pm and Pg for the same outbound link are combined into one
associative processor Pmg. The advantage of using a unified Pmg to replace a pair
of Pm and Pg is the simplification of the overall core router implementation. In
terms of ASIC implementation, the development cost of a Pmg can be much lower
than that of a pair of Pm and Pg. Pmg can be used to implement a simpler
variation of the LAUC-VF method with faster performance.

[00102] In Figures 17a and 17b, two outbound channels $Ch_1$ and $Ch_2$ are shown,
with $t_0$ being the current time. With respect to $t_0$, channel $Ch_1$ has two DBs, $DB_1$
and $DB_2$, scheduled and channel $Ch_2$ has $DB_3$ scheduled. The time between $DB_1$
and $DB_2$ on $Ch_1$, which is a maximal time interval that is not occupied by any DB,
is called a gap. The times labeled $t_1$ and $t_2$ are the unscheduled times for $Ch_1$ and
$Ch_2$, respectively. After $t_1$ and $t_2$, $Ch_1$ and $Ch_2$, respectively, are available for
transmitting any DB.

[00103] The LAUC-VF method tries to schedule DBs according to certain
priorities. For example, suppose that a new data burst $DB_4$ arrives at time $t'$. For
the situation of Figure 17a, $DB_4$ can be scheduled within the gap on $Ch_1$, or on $Ch_2$
after the unscheduled time of $Ch_2$. The LAUC-VF method selects $Ch_1$ for $DB_4$, and
two gaps are generated from one original gap. For the situation of Figure 17b,
$DB_4$ conflicts with $DB_1$ on $Ch_1$ and conflicts with $DB_3$ on $Ch_2$. But by using FDL
buffers, it may be scheduled for transmission without conflicting with DBs on $Ch_1$
and/or DBs on $Ch_2$. Figure 17b shows the scheduling in which $DB_4$ is assigned to
$Ch_2$, and a new gap is generated.

[00104] Assuming that an outbound link has r data channels, the status of this
link can be characterized by two sets:

$S_M = \{(t_i, i) \mid t_i$ is the unscheduled time for channel $Ch_i\}$

$S_G = \{(l_j, r_j, c_j) \mid l_j < r_j$, and the interval $[l_j, r_j]$ is a gap on channel $Ch_{c_j}\}$

[00105] In the embodiment of LAUC-VF proposed in U.S. Ser. No. 09/689,584,
the two associative processors Pm and Pg were proposed to represent $S_M$ and $S_G$,
respectively. Due to the fixed memory word length, the times stored in the
associative memory M of Pm and the associative memory G of Pg are relative
times. Suppose the current time is $t_0$. Then any time value less than $t_0$ is of no use
for scheduling a new DB. Let

$S'_M = \{(\max\{t_i - t_0, 0\}, i) \mid (t_i, i) \in S_M\}$

$S'_G = \{(\max\{l_j - t_0, 0\}, \max\{r_j - t_0, 0\}, c_j) \mid (l_j, r_j, c_j) \in S_G\}$

The times in $S'_M$ and $S'_G$ are times relative to the current time $t_0$, which is used as
a reference point 0. Thus, M of Pm and G of Pg are actually used to store $S'_M$ and
$S'_G$, respectively.
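As an illustrative sketch only, the conversion from absolute-time sets to the relative-time sets stored in M and G can be written with Python sets of tuples. The function name and the tuple layouts, (time, channel) for unscheduled times and (begin, end, channel) for gaps, are assumptions of this example.

```python
def relative_state(s_m, s_g, t0):
    """Build the relative-time sets from absolute-time sets at current time t0."""
    s_m_rel = {(max(t - t0, 0), i) for t, i in s_m}
    s_g_rel = {(max(l - t0, 0), max(r - t0, 0), c) for l, r, c in s_g}
    return s_m_rel, s_g_rel
```

Times already in the past collapse to 0, which reflects the statement that any time value less than the current time is of no use for scheduling a new DB.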

[00106] The channel scheduler proposed in U.S. Ser. No. 09/689,584 assumes
that DBs have arbitrary lengths. One possibility is to assume a slot transmission
mode. In this mode, DBs are transmitted in units of slots, and BHPs are
transmitted as groups, with each group carried by a slot. A slot clock CLKs is
used to determine the slot boundary. The slot transmissions are triggered by
pulses of CLKs. Thus, the relative time is represented in terms of the number of
CLKs cycles. The pulses of CLKs are shown in Figure 18. In addition to clock CLKs,
there is another, finer clock CLKf. The period of CLKs is a multiple of the period of
CLKf. In the example shown in Figure 18, one CLKs cycle contains sixteen CLKf
cycles. Clock CLKf is used to coordinate operations performed within a period of
CLKs.

[00107] In Figures 19a and 19b, modifications to the hardware design of Pm
and Pg given in U.S. Ser. No. 09/689,584 are provided to accommodate slot
transmissions. In Pm, there is an associative memory M of r words. Each word $M_i$
of M is essentially a register, and it is associated with a subtractor 200. A register
MC holds an operand. In the embodiment of Figure 19a, the value stored in MC
is the elapsed time since the last update of M. The value stored in MC is
broadcast to all words $M_i$, $1 \le i \le r$. Each word $M_i$ does the following:
$M_i \leftarrow M_i - MC$ if $M_i > MC$; otherwise, $M_i \leftarrow 0$. This operation is used to update
the relative times stored in M. If MC stores the time elapsed since the parallel
subtraction operation was last performed, performing this operation again updates
these times to be relative to the time at which the new PARALLEL-SUBTRACTION
is performed. Another operation is the parallel comparison. In
this operation, the value stored in MC is broadcast to all words $M_i$, $1 \le i \le r$. Each
word $M_i$ does the following: if $MC \ge M_i$ then $MFLAG_i = 1$, otherwise $MFLAG_i = 0$.
Signals $MFLAG_i$, $1 \le i \le r$, are transformed into an address by a priority encoder.
This address and the word with this address are output to the address and data
registers, respectively, of M. This operation is used to find a channel for the
transmission of a given DB. Similarly, two subtractors are used for a word, one
for each sub-word, of the associative memory G in Pg.

[00108] An alternative design, shown in Figure 19b, is to implement each word
$M_i$ in M as a decrement counter with parallel load. The counter is decremented
by 1 on every pulse of the system slot clock CLKs. The counting stops when the
counter reaches 0, and resumes once the counter is set to a new
positive value. Suppose that at time $t_0$ the counter's value is $t'$ and at time $t_1 > t_0$
the counter's value is $t''$. Then $t''$ is the same time as $t'$, but relative to $t_1$, i.e.,
$t'' = \max\{t' - (t_1 - t_0), 0\}$. Note that any negative time (i.e., $t' - (t_1 - t_0) < 0$) with
respect to the new reference point $t_1$ is not useful in the lookahead channel
scheduling. Associated with each word $M_i$ is a comparator 204. It is used for the
parallel comparison operation. Similarly, a word of G in Pg can be implemented
by two decrement counters with two associated comparators.
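The behavior of one such word can be sketched in software as a saturating down-counter. The class below is a hypothetical model for illustration, with `tick` standing for one pulse of the slot clock CLKs.

```python
class DecrementCounter:
    """One memory word modelled as a down-counter with parallel load."""

    def __init__(self, value: int = 0) -> None:
        self.value = value

    def load(self, value: int) -> None:
        """Parallel load of a new relative time."""
        self.value = value

    def tick(self) -> None:
        """One slot-clock pulse; counting stops once the value reaches 0."""
        if self.value > 0:
            self.value -= 1
```

After n ticks a counter loaded with t' holds max(t' - n, 0), which is the relation stated in the paragraph with n equal to the number of slots elapsed.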

[00109] The system has a c-bit circular increment counter Cs. The value of Cs is
incremented by 1 on every pulse of slot clock CLKs. Let $t_{latency}(BHP_i)$ be the time,
in terms of number of CLKf cycles, between the time $BHP_i$ is received by the router
and the time $BHP_i$ is received by the channel scheduler. The value c is chosen
such that:

$2^{c} > \dfrac{\max_i t_{latency}(BHP_i)}{MAX_s}$

where $MAX_s$ is the number of CLKf cycles within a CLKs cycle. When $BHP_i$ is
received by the router, $BHP_i$ is timestamped by the operation
$timestamp_{recv}(BHP_i) \leftarrow Cs$. When $BHP_i$ is received by the scheduler of the
router, $BHP_i$ is timestamped again by $timestamp_{sch}(BHP_i) \leftarrow Cs$. Let

$D_i = (timestamp_{recv}(BHP_i) + 2^{c} - timestamp_{sch}(BHP_i)) \bmod 2^{c}$.

Then, the relative arrival time (in terms of slot clock CLKs) of $DB_i$ at the optical
switching matrix of the router is $T_i = \Delta + \tau_i + D_i$, where $\tau_i$ is the offset time
between $BHP_i$ and $DB_i$, and $\Delta$ is the fixed input FDL time. Using the slot time at
which $timestamp_{sch}(BHP_i) \leftarrow Cs$ is performed as the reference point, together with
the relative times stored in Pm and Pg, $DB_i$ can be correctly scheduled.
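The timestamp arithmetic can be sketched as follows. The function and argument names are hypothetical, and the final reduction modulo 2^c is an assumption added in this sketch so that the wrapped difference yields an arrival time relative to the scheduler timestamp; the text itself states the sum without that reduction.

```python
def relative_arrival(ts_recv: int, ts_sch: int, c: int, offset: int, fdl: int) -> int:
    """Relative DB arrival time from the two c-bit timestamps of its BHP.

    offset is the BHP-to-DB offset time and fdl the fixed input FDL time,
    both in slots.
    """
    d_i = (ts_recv + 2**c - ts_sch) % 2**c
    return (fdl + offset + d_i) % 2**c  # assumed wrap, see lead-in
```

For example, with an 8-bit counter, a BHP stamped 250 on receipt and 3 at the scheduler (nine slots later, after wrap-around) with a total offset of 25 slots arrives 16 slots after the scheduler timestamp.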

[00110] In the hardware implementation of the LAUC-VF method, associative
processors Pm and Pg are used to store and process $S'_M$ and $S'_G$, respectively. At
any time, $S'_M = \{(t_i, i) \mid 1 \le i \le r\}$ and $S'_G = \{(l_j, r_j, c_j) \mid l_j > 0\}$. A pair $(t_i, i)$ in $S'_M$
represents the unscheduled time on channel $Ch_i$, and a triple $(l_j, r_j, c_j)$ in $S'_G$
represents a time gap (interval) $[l_j, r_j]$ on channel $Ch_{c_j}$. The unscheduled time $t_i$
can be considered as a semi-infinite gap (interval) $[t_i, \infty]$. Thus, by including such
semi-infinite gaps in $S'_G$, $S'_M$ is no longer needed.

[00111] More specifically, let $S''_M = \{(t_i, \infty, i) \mid (t_i, i) \in S'_M\}$, and define
$S'_{MG} = S''_M \cup S'_G$. The basic idea of combining Pm and Pg is to build Pmg by
modifying Pg so that Pmg is used to process $S'_{MG}$. We present the architecture of
associative processor Pmg for replacing Pm and Pg. Pmg uses an associative
memory MG to store pairs in $S'_M$ and triples in $S'_G$. As with G in Pg, each word of
MG has two sub-words, with the first one for $l_j$ and the second one for $r_j$ when it
is used to store $(l_j, r_j, c_j)$. When a word of MG is used to store a pair $(t_i, i)$ of $S'_M$,
the first sub-word is used for $t_i$, and the second is left unused. The first r words
are reserved for $S'_M$, and the remaining words are reserved for $S'_G$. The first r
words are maintained in non-increasing order of their first sub-word. The
remaining words are also maintained in non-increasing order of their first
sub-word. New operations for Pmg are defined.
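A minimal sketch of the unified set, assuming the same tuple layouts as before and using `math.inf` for the open end of a semi-infinite gap, is given below. The helper `fits` is a hypothetical convenience showing that one containment test then serves both kinds of entry.

```python
import math


def unify(s_m_rel, s_g_rel):
    """Merge unscheduled times and gaps into one set of (begin, end, channel)."""
    s_m_ext = {(t, math.inf, i) for t, i in s_m_rel}
    return s_m_ext | set(s_g_rel)


def fits(entry, start, end):
    """A burst occupying [start, end] fits an entry when begin <= start and end <= entry end."""
    l, r, _ = entry
    return l <= start and end <= r
```

In the hardware described, the second sub-word of an unscheduled-time entry is simply left unused; the infinite end time here is only a software convenience.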

[00112] Below, the structures and operations of Pm and Pg are summarized,
and the structure and operations of Pmg are defined. The differences between
Pmg and Pg include the number of address registers used, the priority encoders,
and the operations supported. It is shown that Pmg can be used to implement the
LAUC-VF method without any slow-down, in comparison with the implementation
using Pm and Pg.

[00113] The outbound data channel of a core router has r channels
(wavelengths) for data transmission. These channels are denoted by $Ch_1$, $Ch_2$, ...,
$Ch_r$. Let $S = \{t_i \mid 1 \le i \le r\}$, where $t_i$ is the unscheduled time for channel $Ch_i$. In
other words, at any time after $t_i$, channel $Ch_i$ is available for transmission. Given a
time $T'$, Pm is an associative processor for fast search of $T'' = \min\{t_i \mid t_i \ge T'\}$.
Suppose that $T'' = t_j$; then channel $Ch_j$ is considered as a candidate
data channel for transmitting a DB at time $T'$.

[00114] For purposes of illustration, the structures of Pm and Pg are shown in 
Figures 20 and 21 and Pmg is shown in Figure 22. 

[00115] An embodiment of Pm 210 is shown in Figure 20. Associative processor
Pm includes an associative memory M 212 of k words, $M_1$, $M_2$, ..., $M_k$, one for
each channel of the data channel group. Each word is associated with a simple
subtraction circuit for subtraction and compare operations. The words are also
connected as a linear array. Comparand register MC 214 holds the operand for
comparison. MCH 216 is a memory of k words, $MCH_1$, $MCH_2$, ..., $MCH_k$, with
$MCH_j$ corresponding to $M_j$. The words are connected as a linear array, and they
are used to hold the channel numbers. $MAR_1$ 218 and $MAR_2$ 220 are address
registers for holding addresses for accessing M and MCH. MDR 222 and MCHR
224 are data registers used to access M and MCH along with the MARs.

[00116] Associative processor Pm supports the following major operations that
are used in the efficient implementation of the LAUC-VF channel scheduling
operations:

RANDOM-READ: Given address x in $MAR_1$, do $MDR \leftarrow M_x$, $MCHR \leftarrow MCH_x$.

RANDOM-WRITE: Given address x in $MAR_1$, do $M_x \leftarrow MDR$, $MCH_x \leftarrow MCHR$.

PARALLEL-SEARCH: The value of MC is compared with the values
of all words $M_1$, $M_2$, ..., $M_k$ simultaneously (in parallel). Find the smallest j such
that $M_j \le MC$, and do $MAR_1 \leftarrow j$, $MDR \leftarrow M_j$, and $MCHR \leftarrow MCH_j$. If there does
not exist any word $M_j$ such that $M_j \le MC$, $MAR_1 = 0$ after this operation.

SEGMENT-SHIFT-DOWN: Given addresses a in $MAR_1$ and b in
$MAR_2$ such that a < b, perform $M_{j+1} \leftarrow M_j$ and $MCH_{j+1} \leftarrow MCH_j$ for all $a \le j < b$.

[00117] For RANDOM-READ, RANDOM-WRITE and SEGMENT-SHIFT-DOWN
operations, each pair ($M_j$, $MCH_j$) is treated as a superword. The output
of PARALLEL-SEARCH consists of r binary signals, $MFLAG_i$, $1 \le i \le r$. $MFLAG_i = 1$
if and only if $M_i \le MC$. There is a priority encoder with $MFLAG_i$, $1 \le i \le r$, as
input, and it produces an address j, which is loaded into $MAR_1$ when the
PARALLEL-SEARCH operation is completed. RANDOM-READ, RANDOM-WRITE,
PARALLEL-SEARCH and SEGMENT-SHIFT-DOWN operations are
used to maintain the non-increasing order of values stored in M.
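By way of illustration, PARALLEL-SEARCH and SEGMENT-SHIFT-DOWN can be modelled in software as below. The class name, the 1-based list layout, and the reading of the comparison as "less than or equal" are assumptions of this sketch; the hardware evaluates all flags simultaneously, whereas the list comprehension evaluates them one after another.

```python
class PmModel:
    """Software stand-in for memories M and MCH with 1-based addresses."""

    def __init__(self, times: list[int], channels: list[int]) -> None:
        self.M = [None] + list(times)        # index 0 is unused
        self.MCH = [None] + list(channels)

    def parallel_search(self, mc: int) -> int:
        """Priority-encode the flags: smallest j with M_j <= mc, or 0 if none."""
        flags = [m <= mc for m in self.M[1:]]
        return flags.index(True) + 1 if any(flags) else 0

    def segment_shift_down(self, a: int, b: int) -> None:
        """Copy superword j to position j + 1 for a <= j < b, highest j first."""
        for j in range(b - 1, a - 1, -1):
            self.M[j + 1] = self.M[j]
            self.MCH[j + 1] = self.MCH[j]
```

Shifting from the highest address downward prevents a word from being overwritten before it has been copied, and leaves position a free for a subsequent RANDOM-WRITE.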

[00118] Figure 21 illustrates a block diagram of the associative processor Pg 92.
Pg is used to store unused gaps of all channels of an outbound link of a core
router. A gap is represented by a pair (l, r) of integers, where l and r are the
beginning and the end of the gap, respectively. Associative processor Pg includes
associative memory G 93, comparand register GC 230, memory GCH 232, address
register GAR 234, data registers GDR 236 and GCHR 238 and working registers
$GR_1$ 240 and $GR_2$ 242.

[00119] G is an associative memory of n words, $G_1$, $G_2$, ..., $G_n$, with each $G_i$
consisting of two sub-words $G_{i,1}$ and $G_{i,2}$. The words are connected as a linear
array. GC holds a word of two sub-words, $GC_1$ and $GC_2$. GCH is a memory of n
words, $GCH_1$, $GCH_2$, ..., $GCH_n$, with $GCH_j$ corresponding to $G_j$. The words are
connected as a linear array, and they are used to hold the channel numbers. GAR
is an address register used to hold the address for accessing G. GDR and GCHR are
data registers used to access G and GCH, together with GAR.

[00120] Associative processor Pg supports the following major operations that
are used in the efficient implementation of the LAUC-VF channel scheduling
operations:

RANDOM-WRITE: Given address x in GAR, do $G_{x,1} \leftarrow GDR_1$, $G_{x,2} \leftarrow GDR_2$,
$GCH_x \leftarrow GCHR$.

PARALLEL-DOUBLE-COMPARAND-SEARCH: The value of GC is
compared with the values of all words $G_1$, $G_2$, ..., $G_n$ simultaneously (in parallel).
Find the smallest j such that $G_{j,1} \le GC_1$ and $G_{j,2} \ge GC_2$. If this operation is
successful, then do $GDR_1 \leftarrow G_{j,1}$, $GDR_2 \leftarrow G_{j,2}$, $GCHR \leftarrow GCH_j$, and $GAR \leftarrow j$;
otherwise, $GAR \leftarrow 0$.

PARALLEL-SINGLE-COMPARAND-SEARCH: The value of $GC_1$ is
compared with the values of all words $G_1$, $G_2$, ..., $G_n$ simultaneously (in parallel).
Find the smallest j such that $G_{j,1} > GC_1$. If this operation is
successful, then do $GDR_1 \leftarrow G_{j,1}$, $GDR_2 \leftarrow G_{j,2}$, $GCHR \leftarrow GCH_j$, and $GAR \leftarrow j$;
otherwise, $GAR \leftarrow 0$.

BIPARTITION-SHIFT-UP: Given address a in GAR, do $G_j \leftarrow G_{j+1}$ and
$GCH_j \leftarrow GCH_{j+1}$ for $a \le j < n$, and $G_{n,1} \leftarrow 0$, $G_{n,2} \leftarrow 0$.

BIPARTITION-SHIFT-DOWN: Given address a in GAR, do $G_{j+1} \leftarrow G_j$,
$GCH_{j+1} \leftarrow GCH_j$, $a \le j < n$.

[00121] In Pg, a triple (G_{i,1}, G_{i,2}, GCH_i) corresponds to a gap with
beginning time G_{i,1} and ending time G_{i,2} on channel GCH_i. For
RANDOM-WRITE, PARALLEL-DOUBLE-COMPARAND-SEARCH,
PARALLEL-SINGLE-COMPARAND-SEARCH, BIPARTITION-SHIFT-UP, and
BIPARTITION-SHIFT-DOWN operations, each triple (G_{i,1}, G_{i,2}, GCH_i) is
treated as a superword. The output of a PARALLEL-DOUBLE-COMPARAND-SEARCH
(resp. PARALLEL-SINGLE-COMPARAND-SEARCH) operation consists of n binary
signals, GFLAG_i, 1 ≤ i ≤ n, such that GFLAG_i = 1 if and only if
G_{i,1} < GC_1 and G_{i,2} > GC_2 (resp. G_{i,1} > GC_1). There is a priority
encoder with GFLAG_i, 1 ≤ i ≤ n, as input; it produces an address j, and this
value is loaded into GAR when the operation is completed. RANDOM-WRITE,
PARALLEL-SINGLE-COMPARAND-SEARCH, BIPARTITION-SHIFT-UP, and
BIPARTITION-SHIFT-DOWN operations are used to maintain the non-increasing
order of the values stored in the G_{i,1}s.
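
The flag-and-encoder behavior of the double-comparand search can be modeled in
software. The following Python sketch is illustrative only: the names
gap_flags, priority_encode and double_comparand_search, and the list-of-tuples
representation of G, are assumptions introduced here, and the sequential loops
merely stand in for comparisons that the hardware performs in parallel. The
search condition used is the one stated for PARALLEL-DOUBLE-COMPARAND-SEARCH
(G_{j,1} < GC_1 and G_{j,2} > GC_2).

```python
from typing import List, Tuple

# Hypothetical layout of one superword: (G_i1 start, G_i2 end, GCH_i channel).
Gap = Tuple[int, int, int]


def gap_flags(gaps: List[Gap], gc1: int, gc2: int) -> List[int]:
    """GFLAG_i = 1 iff gap i starts before gc1 and ends after gc2."""
    return [1 if (g1 < gc1 and g2 > gc2) else 0 for (g1, g2, _ch) in gaps]


def priority_encode(flags: List[int]) -> int:
    """Return the smallest 1-based index whose flag is set, or 0 if none is."""
    for i, flag in enumerate(flags, start=1):
        if flag:
            return i
    return 0


def double_comparand_search(gaps: List[Gap], gc1: int, gc2: int):
    """Return (GAR, GDR_1, GDR_2, GCHR); GAR = 0 means that no gap fits."""
    gar = priority_encode(gap_flags(gaps, gc1, gc2))
    if gar == 0:
        return 0, None, None, None
    g1, g2, ch = gaps[gar - 1]
    return gar, g1, g2, ch


# Gaps kept in non-increasing order of their beginning times.
G = [(50, 60, 2), (30, 45, 1), (10, 25, 3)]
print(double_comparand_search(G, 32, 40))  # selects the gap (30, 45) on channel 1
print(double_comparand_search(G, 26, 29))  # no stored gap contains [26, 29]
```

Because the words are held in non-increasing order of G_{i,1}, the smallest
matching address corresponds to the fitting gap with the latest beginning time.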

[00122] The operations of Pm and Pg are discussed in greater detail in U.S. Ser.
No. 09/689,584.

[00123] Figure 22 illustrates a block diagram of a processor Pmg, which
combines the functions of the Pm and Pg processors described above. Pmg
includes associative memory MG 248, comparand register MGC 250, memory
MGCH 252, address registers MGAR_1 254a and MGAR_2 254b, and data registers
MGDR 256 and MGCHR 258.

[00124] MG is an associative memory of m = r + n words, MG_1, MG_2, ..., MG_m,
with each MG_i consisting of two sub-words MG_{i,1} and MG_{i,2}. The words are
also connected as a linear array. MGC is a comparand register that holds a word
of two sub-words, MGC_1 and MGC_2. MGCH is a memory of m words, MGCH_1,
MGCH_2, ..., MGCH_m, with MGCH_j corresponding to MG_j. The words are connected
as a linear array, and they are used to hold the channel numbers.

[00125] Associative processor Pmg supports the following major operations:

RANDOM-READ: Given address x in MGAR_1, do MGDR_1 ← MG_{x,1},
MGDR_2 ← MG_{x,2}, MGCHR ← MGCH_x.

RANDOM-WRITE: Given address x in MGAR_1, do MG_{x,1} ← MGDR_1,
MG_{x,2} ← MGDR_2, MGCH_x ← MGCHR.

PARALLEL-COMPOUND-SEARCH: In parallel, the value of MGC_1 is compared with
the values of all superwords MG_i, 1 ≤ i ≤ m, and the values of MGC_1 and
MGC_2 are compared with all superwords MG_j, r + 1 ≤ j ≤ m.
(i) If MGC_2 ≠ 0, then do the following in parallel: Find the smallest j'
such that j' ≤ r and MG_{j',1} < MGC_1. If this search is successful, then do
MGAR_1 ← j'; otherwise, MGAR_1 ← 0. Find the smallest j'' such that
r + 1 ≤ j'' ≤ m, MG_{j'',1} < MGC_1 and MG_{j'',2} > MGC_2. If this search is
successful, then do MGAR_2 ← j'' and MGCHR ← MGCH_{j''}; otherwise,
MGAR_2 ← 0.
(ii) If MGC_2 = 0, then find the smallest j' such that 1 ≤ j' ≤ m and
MG_{j',1} < MGC_1. If this search is successful, then do MGAR_1 ← j' and
MGCHR ← MGCH_{j'}; otherwise, MGAR_1 ← 0.

BIPARTITION-SHIFT-UP: Given address a in MGAR_1, do MG_j ← MG_{j+1} and
MGCH_j ← MGCH_{j+1} for a ≤ j < m, and MG_{m,1} ← 0, MG_{m,2} ← 0.

SEGMENT-SHIFT-DOWN: Given addresses a in MGAR_1 and b in MGAR_2 such that
a < b, perform MG_{j+1} ← MG_j and MGCH_{j+1} ← MGCH_j for all a ≤ j < b.

[00126] As in Pg, a triple (MG_{i,1}, MG_{i,2}, MGCH_i) may correspond to a gap
with beginning time MG_{i,1} and ending time MG_{i,2} on channel MGCH_i. But in
such a case, it must be that i > r. If i ≤ r, then MG_{i,2} is immaterial, the
pair (MG_{i,1}, MGCH_i) is interpreted as the unscheduled time MG_{i,1} on
channel MGCH_i, and this pair corresponds to a word in Pm. For RANDOM-READ,
RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH, BIPARTITION-SHIFT-UP and
SEGMENT-SHIFT-DOWN operations, each triple (MG_{i,1}, MG_{i,2}, MGCH_i) is
treated as a superword. The first r superwords are used for storing the
unscheduled times of the r outbound channels, and the last m - r superwords are
used to store information about gaps on all outbound channels.

[00127] The output of a PARALLEL-COMPOUND-SEARCH operation consists of binary
signals MGFLAG_i, 1 ≤ i ≤ m, whose values are defined as follows: (i) if
MGC_2 = 0 and MG_{i,1} < MGC_1, then MGFLAG_i = 1; (ii) if MGC_2 ≠ 0, i ≤ r
and MG_{i,1} < MGC_1, then MGFLAG_i = 1; (iii) if MGC_2 ≠ 0, i > r,
MG_{i,1} < MGC_1 and MG_{i,2} > MGC_2, then MGFLAG_i = 1; and (iv) otherwise,
MGFLAG_i = 0.

[00128] There are two encoders. The first one uses MGFLAG_i, 1 ≤ i ≤ r, as its
input, and it produces an address in MGAR_1 after a PARALLEL-COMPOUND-SEARCH
operation is performed if MGC_2 ≠ 0. The second encoder uses MGFLAG_i,
r + 1 ≤ i ≤ m, as its input. It produces an address in MGAR_2 after a
PARALLEL-COMPOUND-SEARCH operation is performed if MGC_2 ≠ 0. There is a
selector with the output of the two encoders as its input. If MGC_2 = 0, the
smallest non-zero address produced by the two encoders, if such an address
exists, is loaded into MGAR_1 after a PARALLEL-COMPOUND-SEARCH operation is
performed; otherwise, MGAR_1 is set to 0. If MGC_2 ≠ 0, the output of the
selector is disabled.
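
The interplay of the two encoders and the selector can be illustrated with a
small software model. This Python sketch is an assumption-laden illustration,
not the circuit itself: the names first_set and compound_search and the tuple
layout are introduced here, the comparison directions follow the definition of
PARALLEL-COMPOUND-SEARCH given for Pmg (MG_{j,1} < MGC_1 and, for gaps,
MG_{j,2} > MGC_2), and MGAR_2 is simply reported as 0 when MGC_2 = 0.

```python
from typing import List, Tuple

# Hypothetical layout of one superword: (MG_i1, MG_i2, MGCH_i).
Word = Tuple[int, int, int]


def first_set(flags: List[int], lo: int, hi: int) -> int:
    """Priority encoder over 1-based positions lo..hi; 0 if no flag is set."""
    for i in range(lo, hi + 1):
        if flags[i - 1]:
            return i
    return 0


def compound_search(mg: List[Word], r: int, mgc1: int, mgc2: int):
    """Return (MGAR_1, MGAR_2); the first r words hold unscheduled times."""
    m = len(mg)
    flags = []
    for i, (w1, w2, _ch) in enumerate(mg, start=1):
        if mgc2 == 0 or i <= r:
            flags.append(1 if w1 < mgc1 else 0)  # only MGC_1 is compared
        else:
            flags.append(1 if (w1 < mgc1 and w2 > mgc2) else 0)  # gap holds burst
    enc1 = first_set(flags, 1, r)      # first encoder: words 1..r
    enc2 = first_set(flags, r + 1, m)  # second encoder: words r+1..m
    if mgc2 != 0:
        return enc1, enc2              # selector output is disabled
    nonzero = [a for a in (enc1, enc2) if a != 0]
    return (min(nonzero) if nonzero else 0), 0  # selector picks smallest address


# r = 2 unscheduled times followed by two gaps.
MG = [(40, 0, 1), (20, 0, 2), (35, 50, 1), (5, 15, 2)]
print(compound_search(MG, 2, 36, 45))  # both encoders report an address
print(compound_search(MG, 2, 10, 0))   # MGC_2 = 0: selector path is used
```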

[00129] RANDOM-READ, RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH,
BIPARTITION-SHIFT-UP and SEGMENT-SHIFT-DOWN operations are used to maintain
the non-increasing order of the values stored in MG_{i,1} of the first r words,
and the non-increasing order of the values stored in MG_{i,1} of the last m - r
words.

[00130] The operations of associative processors Pm and Pg can be carried out
by operations of Pmg without any delay when they are used to implement the
LAUC-VF channel scheduling method. We assume that Pmg contains m = r + n
superwords. In Table 3 (resp. Table 4), the operations of Pm (resp. Pg) given in
the left column are carried out by the operations of Pmg given in the right
column. Instead of searching Pm and Pg concurrently, using Pmg, this step can
be carried out by a PARALLEL-COMPOUND-SEARCH operation.



Table 3: Simulation of Pm by Pmg

Pm                    Pmg
RANDOM-READ           RANDOM-READ
RANDOM-WRITE          RANDOM-WRITE
PARALLEL-SEARCH       PARALLEL-COMPOUND-SEARCH
SEGMENT-SHIFT-DOWN    SEGMENT-SHIFT-DOWN (with MGAR_2 = m)



Table 4: Simulation of Pg by Pmg

Pg                                  Pmg
RANDOM-WRITE                        RANDOM-WRITE
PARALLEL-DOUBLE-COMPARAND-SEARCH    PARALLEL-COMPOUND-SEARCH
PARALLEL-SINGLE-COMPARAND-SEARCH    PARALLEL-COMPOUND-SEARCH (with MGC_2 = 0)
BIPARTITION-SHIFT-UP                SEGMENT-SHIFT-UP (with MGAR_2 = m)
BIPARTITION-SHIFT-DOWN              SEGMENT-SHIFT-DOWN (with MGAR_2 = m - 1)



[00131] In the LAUC-VF method, fitting a given DB into a gap is preferred,
even if the DB can be scheduled on another channel after its unscheduled time,
as shown by the example of Figures 17a-b. With separate Pm and Pg, and with
search operations performed on Pm and Pg simultaneously, this priority is
justifiable. However, the overall circuit for doing so may be considered too
complex.

[00132] By combining Pm and Pg into one associative processor, simpler and
faster variations of the LAUC-VF method are possible. An alternative
embodiment is shown in Figure 23. In this figure, processor P*mg 270 includes an
array TYPE 272 with m bits, each bit being associated with a corresponding word
in memory MG. If TYPE_i = 1, then MG_i stores an item of S'_M; otherwise, MG_i
stores an item of S'_G. Further, register TYPER 274 is a one-bit register used
to access TYPE, together with MGAR_1 and MGAR_2.

[00133] Other differences between P*mg and Pmg include the priority encoder
used and the operations supported. When a new DB is scheduled, MG is
searched. The fitting time interval found, regardless of whether it is a gap or
a semi-infinite interval, will be used for the new DB. Once the DB is scheduled,
one more gap may be generated. As long as there is sufficient space in MG, the
new gap is stored in MG. When MG is full, an item of S'_G may be lost. But it is
enforced that all items of S'_M must be kept.

[00134] Let t_s^{out}(DB_i) and t_e^{out}(DB_i) be the transmitting times of the
first and last slots of DB_i at the output of the router, respectively. Then

    t_s^{out}(DB_i) = T_i + L_j

and

    t_e^{out}(DB_i) = T_i + L_j + length(DB_i),

where T_i is the relative arrival time defined above, L_j is the FDL delay time
selected for DB_i in the switching matrix, and length(DB_i) is the length of
DB_i in terms of number of slots. Assume that there are q + 1 FDLs L_0, L_1,
..., L_q in the DB switching matrix such that L_0 = 0 < L_1 < L_2 < ... <
L_{q-1} < L_q. The new variation of LAUC-VF is sketched as follows:

method CHANNEL-SCHEDULING
begin
    success ← 0;
    for j = 0 to q do
        MGC_1 ← T_i + L_j;
        MGC_2 ← T_i + L_j + length(DB_i);
        perform PARALLEL-COMPOUND-SEARCH using P*mg;
        if MGAR_1 ≠ 0 then
        begin
            output MGCHR as the number of the channel for transmitting DB_i;
            output L_j as the selected FDL delay time for DB_i;
            update MG of P*mg using the values in MGC_1 and MGC_2;
            success ← 1;
            exit /* exit the for-loop */
        end
    endfor
    if success = 0 then drop DB_i /* scheduling for DB_i has failed */
end

[00135] Once a DB is scheduled, MG is updated. When a gap is to be added
into MG, and TYPE_m = 1, the new gap is ignored. This ensures that no item
belonging to S'_M is lost.
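
The sequential method CHANNEL-SCHEDULING can be sketched in software as
follows. This is a minimal illustration under stated assumptions: the function
names, the tuple layout of a superword and the sample data are introduced here
and are not part of the disclosure; the loop over words stands in for the
parallel comparison; and the update of MG after a successful search is omitted.

```python
from typing import List, Optional, Tuple

# Hypothetical superword: (MG_i1, MG_i2, TYPE_i, MGCH_i).
# TYPE_i = 1 marks an unscheduled time, TYPE_i = 0 marks a gap.
Word = Tuple[int, int, int, int]


def compound_search(mg: List[Word], mgc1: int, mgc2: int) -> int:
    """Smallest 1-based address whose interval can hold [mgc1, mgc2]; 0 if none."""
    for addr, (w1, w2, typ, _ch) in enumerate(mg, start=1):
        if typ == 1 and w1 < mgc1:
            return addr
        if typ == 0 and w1 < mgc1 and w2 > mgc2:
            return addr
    return 0


def channel_scheduling(mg: List[Word], t_i: int, length: int,
                       fdls: List[int]) -> Optional[Tuple[int, int]]:
    """Try each FDL delay in increasing order; return (channel, delay) or None."""
    for delay in fdls:                      # L_0 = 0 < L_1 < ... < L_q
        mgc1 = t_i + delay                  # t_s^out(DB_i)
        mgc2 = t_i + delay + length         # t_e^out(DB_i)
        addr = compound_search(mg, mgc1, mgc2)
        if addr != 0:
            return mg[addr - 1][3], delay   # MGCHR and the selected L_j
    return None                             # no interval found: the DB is dropped


# Two channels: unscheduled times 120 (channel 2) and 100 (channel 1),
# plus one gap from 60 to 90 on channel 2; non-increasing in MG_i1.
MG = [(120, 0, 1, 2), (100, 0, 1, 1), (60, 90, 0, 2)]
print(channel_scheduling(MG, 65, 10, [0, 20, 40]))   # fits the gap without delay
print(channel_scheduling(MG, 85, 10, [0, 20, 40]))   # needs the second FDL
print(channel_scheduling(MG, 10, 100, [0, 20, 40]))  # cannot be scheduled
```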

[00136] Associative processor P*mg supports the following major operations:

RANDOM-READ: Given address x in MGAR_1, do MGDR_1 ← MG_{x,1},
MGDR_2 ← MG_{x,2}, MGCHR ← MGCH_x, TYPER ← TYPE_x.

RANDOM-WRITE: Given address x in MGAR_1, do MG_{x,1} ← MGDR_1,
MG_{x,2} ← MGDR_2, MGCH_x ← MGCHR, TYPE_x ← TYPER.

PARALLEL-COMPOUND-SEARCH: The value of MGC_1 is compared with the values of
all superwords MG_i, 1 ≤ i ≤ m, and the value of MGC_2 is compared with all
superwords MG_i, 1 ≤ i ≤ m, whose TYPE_i = 0, in parallel. Find the smallest j'
such that TYPE_{j'} = 1 and MG_{j',1} < MGC_1, or TYPE_{j'} = 0,
MG_{j',1} < MGC_1 and MG_{j',2} > MGC_2. If this search is successful, then do
MGAR_1 ← j', TYPER ← TYPE_{j'}, MGCHR ← MGCH_{j'}; otherwise, MGAR_1 ← 0.

BIPARTITION-SHIFT-UP, SEGMENT-SHIFT-DOWN: same as in Pmg.

[00137] In operation, the value of TYPE_i indicates the type of information
stored in MG_i. As in Pg, a triple (MG_{i,1}, MG_{i,2}, MGCH_i) may correspond
to a gap with beginning time MG_{i,1} and ending time MG_{i,2} on channel
MGCH_i. But in such a case, it must be that TYPE_i = 0. If TYPE_i = 1, then
MG_{i,2} is immaterial, the pair (MG_{i,1}, MGCH_i) is interpreted as the
unscheduled time MG_{i,1} on channel MGCH_i, and this pair corresponds to a word
in Pm. For RANDOM-READ, RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH,
BIPARTITION-SHIFT-UP and SEGMENT-SHIFT-DOWN operations, each quadruple
(MG_{i,1}, MG_{i,2}, TYPE_i, MGCH_i) is treated as a superword.

[00138] The output of a PARALLEL-COMPOUND-SEARCH operation consists of binary
signals MGFLAG_i, 1 ≤ i ≤ m, whose values are defined as follows. If
MGC_2 ≠ 0, TYPE_i = 0, MG_{i,1} < MGC_1 and MG_{i,2} > MGC_2, then
MGFLAG_i = 1. If MGC_2 = 0 and MG_{i,1} < MGC_1, then MGFLAG_i = 1.
Otherwise, MGFLAG_i = 0. There is a priority encoder with MGFLAG_i,
1 ≤ i ≤ m, as its input, and it produces an address in MGAR_1 after a
PARALLEL-COMPOUND-SEARCH operation is performed.

[00139] RANDOM-READ, RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH,
BIPARTITION-SHIFT-UP and SEGMENT-SHIFT-DOWN operations are used to maintain
the non-increasing order of the values stored in the MG_{i,1}s.

[00140] Figure 24 illustrates the use of multiple associative processors for fast
scheduling. Channel scheduling for an OBS core router is very time critical, and
multiple associative processors (shown in Figure 24 as P*mg processors 270),
which are parallel processors, are proposed to implement scheduling methods.
Suppose that there are q + 1 FDLs L_0 = 0, L_1, ..., L_q in the DB switching
matrix such that L_0 < L_1 < ... < L_q. These FDLs are used, when necessary, to
delay DBs and increase the possibility that the DBs can be successfully
scheduled. In the implementation of the LAUC-VF method presented in U.S. Ser.
No. 09/689,584, the same pair of Pm and Pg is searched repeatedly using
different FDLs until a scheduling solution is found or all FDLs are exhausted.
The method CHANNEL-SCHEDULING described above uses the same idea.

[00141] To speed up the scheduling, a scheduler 42 may use q + 1 Pm/Pg pairs,
one for each L_j. At any time, all q + 1 Ms have the same content, all q + 1
MCHs have the same content, all q + 1 Gs have the same content, and all q + 1
GCHs have the same content. Then finding a scheduling solution for all different
FDLs can be performed on these Pm/Pg pairs simultaneously. At most one search
result is used for a DB. All Pm/Pg pairs are updated simultaneously by the same
lock-step operations to ensure that they store the same information. Similarly,
one may use q + 1 Pmgs or P*mgs to speed up the scheduling.

[00142] In Figure 24, a multiple processor system 300 uses q + 1 P*mgs 270 to
implement the method CHANNEL-SCHEDULING described above. Similarly, the
LAUC-VF method can be implemented using multiple Pm/Pg pairs, or multiple Pmgs,
in a similar way to achieve better performance. The multiple P*mgs 270 include
q + 1 associative memories MG^0, MG^1, ..., MG^q. Each MG^j has m words
MG^j_1, MG^j_2, ..., MG^j_m, with each MG^j_i consisting of two sub-words
MG^j_{i,1} and MG^j_{i,2}. There are q + 1 comparand registers MGC^0, MGC^1,
..., MGC^q. Each MGC^j holds a word of two sub-words, MGC^j_1 and MGC^j_2.
There are q + 1 associative memories MGCH^0, MGCH^1, ..., MGCH^q. Each MGCH^j
has m words, MGCH^j_1, MGCH^j_2, ..., MGCH^j_m. The words in MGCH^j are
connected as a linear array. There are q + 1 linear arrays TYPE^0, TYPE^1, ...,
TYPE^q, where TYPE^j has m bits, TYPE^j_1, TYPE^j_2, ..., TYPE^j_m. MGAR_1 and
MGAR_2 are address registers used to hold addresses for accessing MG and MGCH.
MGDR, TYPER and MGCHR are data registers used to access the MGs, TYPEs and
MGCHs.

[00143] This multiple processor system 300 supports the following major
operations:

RANDOM-READ: Given address x in MGAR_1, do MGDR_1 ← MG^0_{x,1},
MGDR_2 ← MG^0_{x,2}, MGCHR ← MGCH^0_x, TYPER ← TYPE^0_x.

RANDOM-WRITE: Given address x in MGAR_1, do MG^j_{x,1} ← MGDR_1,
MG^j_{x,2} ← MGDR_2, MGCH^j_x ← MGCHR, TYPE^j_x ← TYPER, for 0 ≤ j ≤ q.

PARALLEL-COMPOUND-SEARCH: For 0 ≤ j ≤ q, the value of MGC^j_1 is compared
with the values of all superwords MG^j_i, 1 ≤ i ≤ m, and the value of MGC^j_2
is compared with all superwords MG^j_i, 1 ≤ i ≤ m, whose TYPE^j_i = 0, in
parallel. For 0 ≤ j ≤ q, find the smallest k_j such that TYPE^j_{k_j} = 1 and
MG^j_{k_j,1} < MGC^j_1, or TYPE^j_{k_j} = 0, MG^j_{k_j,1} < MGC^j_1 and
MG^j_{k_j,2} > MGC^j_2. If this search is successful, let I_j = 1; otherwise
let I_j = 0. Find FD = min{j | I_j = 1, 0 ≤ j ≤ q}. If such a j exists, then
do j ← FD, MGAR_1 ← k_j, TYPER ← TYPE^j_{k_j}, MGCHR ← MGCH^j_{k_j};
otherwise, MGAR_1 ← 0.

BIPARTITION-SHIFT-UP: Given address a in MGAR_1, for 0 ≤ j ≤ q, do
MG^j_i ← MG^j_{i+1} and MGCH^j_i ← MGCH^j_{i+1} for a ≤ i < m, and
MG^j_{m,1} ← 0, MG^j_{m,2} ← 0.

SEGMENT-SHIFT-DOWN: Given addresses a in MGAR_1 and b in MGAR_2 such that
a < b, for 0 ≤ j ≤ q, do MG^j_{i+1} ← MG^j_i and MGCH^j_{i+1} ← MGCH^j_i for
all a ≤ i < b.

[00144] A RANDOM-READ operation is performed on one copy of P*mg, i.e.,
MG^0, TYPE^0 and MGCH^0. RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH,
BIPARTITION-SHIFT-UP and SEGMENT-SHIFT-DOWN operations are performed on all
copies of P*mg. For RANDOM-READ, RANDOM-WRITE, PARALLEL-COMPOUND-SEARCH,
BIPARTITION-SHIFT-UP and SEGMENT-SHIFT-DOWN operations, each quadruple
(MG^j_{i,1}, MG^j_{i,2}, TYPE^j_i, MGCH^j_i) is treated as a superword. When a
PARALLEL-COMPOUND-SEARCH operation is performed, the outputs of all P*mg copies
are the inputs of selectors. The output of one P*mg copy is selected.



[00145] The CHANNEL-SCHEDULING method may be implemented in the multiple
processor system as:

method PARALLEL-CHANNEL-SCHEDULING
begin
    success ← 0;
    for j = 0 to q do in parallel
        MGC^j_1 ← T_i + L_j;
        MGC^j_2 ← T_i + L_j + length(DB_i);
    endfor
    perform PARALLEL-COMPOUND-SEARCH;
    if MGAR_1 ≠ 0 then
    begin
        output MGCHR as the number of the channel for transmitting DB_i;
        output L_j as the selected FDL delay time for DB_i;
        k ← FD;
        for j = 0 to q do in parallel
            update MG^j, 0 ≤ j ≤ q, using the values in MGC^k_1 and MGC^k_2
        endfor
        success ← 1;
    end
    if success = 0 then drop DB_i /* scheduling for DB_i has failed */
end
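
The parallel variant can be modeled in the same spirit. In the following
Python sketch, which is an illustration under assumptions rather than the
disclosed circuit, a single list stands in for all q + 1 identical copies
MG^j, a list comprehension stands in for the simultaneous searches, and the
names search_copy and parallel_channel_scheduling are introduced here. The
update of the copies after a successful search is omitted.

```python
from typing import List, Optional, Tuple

# Hypothetical superword: (MG_i1, MG_i2, TYPE_i, MGCH_i).
Word = Tuple[int, int, int, int]


def search_copy(mg: List[Word], mgc1: int, mgc2: int) -> int:
    """Search performed by one P*mg copy; returns k_j (1-based) or 0."""
    for addr, (w1, w2, typ, _ch) in enumerate(mg, start=1):
        if w1 < mgc1 and (typ == 1 or w2 > mgc2):
            return addr
    return 0


def parallel_channel_scheduling(mg: List[Word], t_i: int, length: int,
                                fdls: List[int]) -> Optional[Tuple[int, int, int]]:
    """Return (FD, k_FD, channel) for the smallest successful FDL index, else None."""
    # Copy j searches with its own comparand pair (MGC^j_1, MGC^j_2).
    results = [search_copy(mg, t_i + d, t_i + d + length) for d in fdls]
    hits = [j for j, k in enumerate(results) if k != 0]  # indices with I_j = 1
    if not hits:
        return None                                      # the DB is dropped
    fd = min(hits)                                       # selector output FD
    k = results[fd]
    return fd, k, mg[k - 1][3]


MG = [(120, 0, 1, 2), (100, 0, 1, 1), (60, 90, 0, 2)]
print(parallel_channel_scheduling(MG, 85, 10, [0, 20, 40]))
print(parallel_channel_scheduling(MG, 10, 100, [0, 20, 40]))
```

Selecting the minimum successful index mirrors the sequential method, which
stops at the first FDL that yields a scheduling solution.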

[00146] It may be desirable to be able to partition the r data channels into
groups and choose a particular group to schedule DBs. Such situations may
occur on several occasions. For example, one may want to test a particular
channel. In such a situation, the channel to be tested forms a channel group by
itself, and all other channels form another group. Then, channel scheduling is
only performed on the 1-channel group. Another occasion is when, during the
operation of the router, some channels fail to transmit DBs. Then, the
channels of the same outbound link can be partitioned into two groups, the
group that contains all failed channels and the group that contains all normal
channels, and only normal channels are selected for transmitting DBs.
Partitioning data channels also allows channel reservation, which has
applications in quality of service. Using reserved channel groups, virtual
circuits and virtual networks can be constructed.

[00147] To incorporate the group partition feature into the channel scheduling
associative processors, the basic idea is to associate a group identifier (or
gid for short) with each channel. For a link, all the channels that share the
same gid belong to the same group. The gid of a channel is programmable; i.e.,
it can be changed dynamically according to need. The gid for a DB can be derived
from its BHP and/or some other local information.

[00148] The designs of Pm and Pg may be extended to Pm-ext and Pg-ext to
incorporate multiple channel groups, as shown in Figures 25 and 26,
respectively. As shown in Figure 25, associative processor Pm-ext 290 includes
M, MC, MCH, MAR_1, MAR_2, MDR, MCHR, as described in connection with Figure
20. MGIDC 292 is a comparand register that holds the gid for comparison. MGID
294 is a memory of r words, MGID_1, MGID_2, ..., MGID_r, with MGID_j
corresponding to M_j and MCH_j. The words are connected as a linear array, and
they are used to hold the channel group numbers. MGIDDR 296 is a data
register.

[00149] Pm-ext is similar to Pm, with several added components and modified
operations. The linear array MGID has r locations, MGID_1, MGID_2, ...,
MGID_r; each is used to store an integer gid. MGID_i is associated with M_i and
MCH_i, i.e., a triple (M_i, MCH_i, MGID_i) is treated as a superword. Comparand
register MGIDC and data register MGIDDR are added.

[00150] Associative processor Pm-ext supports the following major operations
that are used in the efficient implementation of the LAUC-VF channel scheduling
operations.

RANDOM-READ: Given address x in MAR_1, do MDR ← M_x, MCHR ← MCH_x and
MGIDDR ← MGID_x.

RANDOM-WRITE: Given address x in MAR_1, do M_x ← MDR, MCH_x ← MCHR and
MGID_x ← MGIDDR.

PARALLEL-SEARCH1: Simultaneously, MGIDC is compared with the values of
MGID_1, MGID_2, ..., MGID_r. Find j such that MGID_j = MGIDC, and do
MAR_1 ← j, MDR ← M_j, MCHR ← MCH_j, and MGIDDR ← MGID_j.

PARALLEL-SEARCH2: Simultaneously, (MC, MGIDC) is compared with
(M_1, MGID_1), (M_2, MGID_2), ..., (M_r, MGID_r). Find the smallest j such
that M_j < MC and MGID_j = MGIDC, and do MAR_1 ← j, MDR ← M_j, MCHR ← MCH_j,
and MGIDDR ← MGID_j. If there does not exist any word (M_j, MGID_j) such that
M_j < MC and MGID_j = MGIDC, MAR_1 = 0 after this operation.

SEGMENT-SHIFT-DOWN: Given addresses a in MAR_1 and b in MAR_2 such that
a < b, perform M_{j+1} ← M_j, MCH_{j+1} ← MCH_j and MGID_{j+1} ← MGID_j for
all a ≤ j < b.

[00151] For RANDOM-READ, RANDOM-WRITE and SEGMENT-SHIFT-DOWN operations,
each triple (M_j, MCH_j, MGID_j) is treated as a superword. The output of
PARALLEL-SEARCH1 consists of r binary signals, MFLAG_i, 1 ≤ i ≤ r.
MFLAG_i = 1 if and only if MGID_i = MGIDC. There is a priority encoder with
MFLAG_i, 1 ≤ i ≤ r, as input; it produces an address j, and this value is
loaded into MAR_1 when the PARALLEL-SEARCH1 operation is completed. The output
of PARALLEL-SEARCH2 consists of r binary signals, MFLAG_i, 1 ≤ i ≤ r.
MFLAG_i = 1 if and only if M_i < MC and MGID_i = MGIDC. The same priority
encoder used in PARALLEL-SEARCH1 transforms MFLAG_i, 1 ≤ i ≤ r, into an
address j, and this value is loaded into MAR_1 when the PARALLEL-SEARCH2
operation is completed. RANDOM-READ, RANDOM-WRITE, PARALLEL-SEARCH2 and
SEGMENT-SHIFT-DOWN operations are used to maintain the non-increasing order of
the values stored in M.
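
A group-restricted search of the kind performed by PARALLEL-SEARCH2 can be
sketched as follows. The function name parallel_search2, the tuple layout and
the sample values are assumptions made for illustration; the loop models the
priority encoder acting on flags that the hardware computes in parallel.

```python
from typing import List, Tuple

# Hypothetical superword: (M_i, MCH_i, MGID_i).
Superword = Tuple[int, int, int]


def parallel_search2(words: List[Superword], mc: int, mgidc: int) -> int:
    """Return MAR_1: smallest 1-based i with M_i < mc and MGID_i == mgidc, else 0."""
    # MFLAG_i = 1 iff the unscheduled time is early enough and the gid matches.
    mflag = [1 if (m_i < mc and gid == mgidc) else 0 for (m_i, _ch, gid) in words]
    for i, flag in enumerate(mflag, start=1):  # priority encoder
        if flag:
            return i
    return 0


# Unscheduled times in non-increasing order; channels 3 and 2 are in group 1,
# channels 1 and 4 are in group 2.
M = [(90, 3, 1), (70, 1, 2), (50, 2, 1), (20, 4, 2)]
print(parallel_search2(M, 80, 1))  # address of the group-1 word with M_i < 80
print(parallel_search2(M, 80, 2))  # address of the group-2 word with M_i < 80
```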

[00152] Figure 26 illustrates a block diagram of Pg-ext. Pg-ext 300 includes
G, GC, GCH, GAR, GDR, GCHR, as described in connection with Figure 21. GGIDC
302 is a comparand register for holding the gid for comparison. GGID 304 is a
memory of n words, GGID_1, GGID_2, ..., GGID_n, with GGID_j corresponding to G_j
and GCH_j. The words are connected as a linear array, and they are used to hold
the channel group numbers. GGIDR 306 is a data register.

[00153] Similar to the architecture of Pm-ext, a linear array GGID of n words,
GGID_1, GGID_2, ..., GGID_n, is added to Pg. A quadruple (G_{i,1}, G_{i,2},
GCH_i, GGID_i) is treated as a superword.

[00154] Associative processor Pg-ext supports the following major operations
that are used in the efficient implementation of the LAUC-VF channel scheduling
operations.

RANDOM-WRITE: Given address x in GAR, do G_{x,1} ← GDR_1, G_{x,2} ← GDR_2,
GCH_x ← GCHR, GGID_x ← GGIDR.

PARALLEL-DOUBLE-COMPARAND-SEARCH: The value of (GC, GGIDC) is compared
with (G_1, GGID_1), (G_2, GGID_2), ..., (G_n, GGID_n) simultaneously (in
parallel). Find the smallest j such that G_{j,1} < GC_1, G_{j,2} > GC_2 and
GGID_j = GGIDC. If this operation is successful, then do GDR_1 ← G_{j,1},
GDR_2 ← G_{j,2}, GCHR ← GCH_j, GGIDR ← GGID_j and GAR ← j; otherwise,
GAR ← 0.

PARALLEL-SINGLE-COMPARAND-SEARCH: (GC_1, GGIDC) is compared with
(G_{1,1}, GGID_1), (G_{2,1}, GGID_2), ..., (G_{n,1}, GGID_n) simultaneously (in
parallel). Find the smallest j such that G_{j,1} > GC_1 and GGID_j = GGIDC. If
this operation is successful, then do GDR_1 ← G_{j,1}, GDR_2 ← G_{j,2},
GCHR ← GCH_j, GGIDR ← GGID_j and GAR ← j; otherwise, GAR ← 0.

BIPARTITION-SHIFT-UP: Given address a in GAR, do G_j ← G_{j+1},
GCH_j ← GCH_{j+1} and GGID_j ← GGID_{j+1} for a ≤ j < n, and G_{n,1} ← 0,
G_{n,2} ← 0.

BIPARTITION-SHIFT-DOWN: Given address a in GAR, do G_{j+1} ← G_j,
GCH_{j+1} ← GCH_j and GGID_{j+1} ← GGID_j for a ≤ j < n.

[00155] A quadruple (G_{i,1}, G_{i,2}, GCH_i, GGID_i) corresponds to a gap with
beginning time G_{i,1} and ending time G_{i,2} on channel GCH_i, whose gid is in
GGID_i. For RANDOM-WRITE, PARALLEL-DOUBLE-COMPARAND-SEARCH,
PARALLEL-SINGLE-COMPARAND-SEARCH, BIPARTITION-SHIFT-UP, and
BIPARTITION-SHIFT-DOWN operations, each quadruple (G_{i,1}, G_{i,2}, GCH_i,
GGID_i) is treated as a superword. The output of a
PARALLEL-DOUBLE-COMPARAND-SEARCH (resp. PARALLEL-SINGLE-COMPARAND-SEARCH)
operation consists of n binary signals, GFLAG_i, 1 ≤ i ≤ n, such that
GFLAG_i = 1 if and only if G_{i,1} < GC_1 and G_{i,2} > GC_2 (resp.
G_{i,1} > GC_1), and GGID_i = GGIDC. There is a priority encoder with GFLAG_i,
1 ≤ i ≤ n, as input; it produces an address j, and this value is loaded into
GAR when the operation is completed. RANDOM-WRITE,
PARALLEL-SINGLE-COMPARAND-SEARCH, BIPARTITION-SHIFT-UP, and
BIPARTITION-SHIFT-DOWN operations are used to maintain the non-increasing
order of the values stored in the G_{i,1}s.

[00156] Changing the gid of a channel Ch_j from g_1 to g_2 is done as follows:
find the triple (M_i, MCH_i, MGID_i) such that MCH_i = j, and store i into MAR_1
and the triple into (MDR, MCHR, MGIDDR); do MGIDDR ← g_2; and write back (MDR,
MCHR, MGIDDR) using the address i in MAR_1.
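
The read, modify and write-back sequence for changing a gid can be sketched
as follows. The function name change_gid and the list-of-lists representation
are assumptions introduced for illustration; the local variables are named
after the registers they stand for.

```python
from typing import List


def change_gid(words: List[List[int]], channel: int, new_gid: int) -> int:
    """Rewrite the gid of `channel` in place; words hold [M_i, MCH_i, MGID_i].

    Returns the 1-based address written (the value left in MAR_1), or 0 if the
    channel is not found.
    """
    for i, (m_i, mch_i, _old_gid) in enumerate(words, start=1):
        if mch_i == channel:
            mar1 = i                                # address of the matching triple
            mdr, mchr, mgiddr = m_i, mch_i, new_gid  # read, then overwrite the gid
            words[mar1 - 1] = [mdr, mchr, mgiddr]    # write back at address MAR_1
            return mar1
    return 0


M = [[90, 3, 1], [70, 1, 2], [50, 2, 1]]
print(change_gid(M, 1, 5), M)  # channel 1 moves from group 2 to group 5
```

Only the gid field changes, so the ordering of the stored unscheduled times is
unaffected by this operation.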

[00157] Given a DB', t_s^{out}(DB'), t_e^{out}(DB'), and a gid g, the scheduling
of DB' involves searches in Pm-ext and Pg-ext. Searching in Pm-ext is done as
follows: find the smallest i such that M_i < t_s^{out}(DB') and MGID_i = g.
Searching in Pg-ext is done as follows: find the smallest i such that
G_{i,1} < t_s^{out}(DB'), G_{i,2} > t_e^{out}(DB'), and GGID_i = g.

[00158] Similarly, associative processors Pmg-ext and P*mg-ext can be
constructed by adding a gid comparand register MGGIDC, a memory MGGID of m
words MGGID_1, MGGID_2, ..., MGGID_m, and a data register MGGIDDR. Pmg-ext is a
combination of Pm-ext and Pg-ext. The operations of Pmg-ext can be easily
derived from the operations of Pm-ext and Pg-ext, since the Pm-ext items and the
Pg-ext items are separated. In P*mg-ext, the Pm-ext items and the Pg-ext items
are mixed. Since the MG_{i,1} values of these items are in non-decreasing order,
finding the Pm-ext item corresponding to channel Ch_i can be carried out by
finding the smallest j such that MGGID_j = i.

[00159] Although the Detailed Description of the invention has been directed
to certain exemplary embodiments, various modifications of these embodiments,
as well as alternative embodiments, will be suggested to those skilled in the art.
The invention encompasses any modifications or alternative embodiments that
fall within the scope of the Claims.